* [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
@ 2017-04-06 14:31 Gionatan Danti
  2017-04-07  8:19 ` Mark Mielke
                   ` (2 more replies)
  0 siblings, 3 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-06 14:31 UTC (permalink / raw)
  To: linux-lvm

Hi all,
I'm seeking some advice for a new virtualization system (KVM) on top of 
LVM. The goal is to take agentless backups via LVM snapshots.

In short: what would you suggest for snapshotting a quite big (8+ TB) 
volume? Classic LVM (with old snapshot behavior) or thinlvm (and its new 
snapshot method)?

Long story:
In the past, I used classical, preallocated logical volumes directly 
exported as virtual disks. In this case, I snapshot the single LV I want 
to back up and, using dd/ddrescue, I copy it.

Problem is this solution prevents any use of thin allocation or sparse 
files, so I tried to replace it with something filesystem-based. Lately 
I used another approach, configuring a single thinly provisioned LV 
(with no zeroing) + XFS + raw or qcow2 virtual machine images. To make 
backups, I snapshotted the entire thin LV and, after mounting it, I 
copied the required files.
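
Roughly, the two workflows look like this (just a sketch; VG, LV and path 
names are placeholders):

    # old approach: classic snapshot of one preallocated LV, then a raw copy
    lvcreate -s -L 20G -n vm1-snap vg0/vm1-disk
    dd if=/dev/vg0/vm1-snap of=/backup/vm1-disk.img bs=1M conv=sparse
    lvremove -y vg0/vm1-snap

    # current approach: thin snapshot of the whole image LV, mount it, copy files
    lvcreate -s -n images-snap vg0/images         # thin snapshot, no size needed
    lvchange -ay -K vg0/images-snap               # thin snapshots skip activation by default
    mount -o ro,nouuid /dev/vg0/images-snap /mnt/snap   # nouuid needed for XFS
    cp --sparse=always /mnt/snap/vm1.qcow2 /backup/
    umount /mnt/snap
    lvremove -y vg0/images-snap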

So far this second solution worked quite well. However, before using it 
in more and more installations, I wonder if it is the correct approach 
or if something better, especially from a stability standpoint, is possible.

Given that I would like to use XFS, and that I need snapshots at the 
block level, two possibilities came to mind:

1) continue to use thinlvm + thin snapshots + XFS. What do you think 
about an 8+ TB thin pool/volume with relatively small (64/128KB) chunks? 
Would you be comfortable using it in production workloads? What about 
powerloss protection? From my understanding, thinlvm passes flushes 
anytime the higher layers issue them and so should be reasonably safe 
against unexpected powerloss. Is this view right? (A provisioning sketch 
follows below.)

2) use a classic (non-thin) LVM + normal snapshot + XFS. I know for sure 
that LV size is not an issue here; however, big snapshot sizes used to be 
problematic: the CoW table had to be read completely before the snapshot 
could be activated. Has this problem been solved, or can big snapshots 
still be problematic?
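
The kind of provisioning I have in mind for option 1, just as a sketch 
(VG, pool and LV names are placeholders):

    # 8 TB thin pool with 64KB chunks and zeroing disabled, one big thin LV on top
    lvcreate -L 8T -c 64k -Zn -T vg0/pool0
    lvcreate -V 8T -T vg0/pool0 -n images
    mkfs.xfs /dev/vg0/images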

Thank you all.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-06 14:31 [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM Gionatan Danti
@ 2017-04-07  8:19 ` Mark Mielke
  2017-04-07  9:12   ` Gionatan Danti
  2017-04-07 18:21 ` Tomas Dalebjörk
  2017-04-13 10:20 ` Gionatan Danti
  2 siblings, 1 reply; 94+ messages in thread
From: Mark Mielke @ 2017-04-07  8:19 UTC (permalink / raw)
  To: LVM general discussion and development


On Thu, Apr 6, 2017 at 10:31 AM, Gionatan Danti <g.danti@assyoma.it> wrote:

> I'm seeking some advice for a new virtualization system (KVM) on top of
> LVM. The goal is to take agentless backups via LVM snapshots.
>
> In short: what would you suggest for snapshotting a quite big (8+ TB) volume?
> Classic LVM (with old snapshot behavior) or thinlvm (and its new snapshot
> method)?
>

I found classic LVM snapshots to suffer terrible performance. I switched to
BTRFS as a result, until LVM thin pools became a real thing, and I happily
switched back.

I expect this depends on exactly what access patterns you have, how many
accesses will happen during the time the snapshot is held, and whether you
are using spindles or flash. Still, even with some attempt to be objective
and critical... I think I would basically never use classic LVM snapshots
for any purpose, ever.


-- 
Mark Mielke <mark.mielke@gmail.com>


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-07  8:19 ` Mark Mielke
@ 2017-04-07  9:12   ` Gionatan Danti
  2017-04-07 13:50     ` L A Walsh
  2017-04-07 22:24     ` Mark Mielke
  0 siblings, 2 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-07  9:12 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Mark Mielke

Il 07-04-2017 10:19 Mark Mielke ha scritto:
> 
> I found classic LVM snapshots to suffer terrible performance. I
> switched to BTRFS as a result, until LVM thin pools became a real
> thing, and I happily switched back.

So you are now on lvmthin? Can I ask on what pool/volume/filesystem 
size?

> 
> I expect this depends on exactly what access patterns you have, how
> many accesses will happen during the time the snapshot is held, and
> whether you are using spindles or flash. Still, even with some attempt
> to be objective and critical... I think I would basically never use
> classic LVM snapshots for any purpose, ever.
> 

Sure, but for nightly backups reduced performance should not be a 
problem. Moreover, increasing snapshot chunk size (eg: from default 4K 
to 64K) gives much faster write performance.
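
e.g. something along these lines for the backup snapshot (names and sizes 
are placeholders):

    # classic snapshot with a 64K CoW chunk size instead of the 4K default
    lvcreate -s -L 50G -c 64k -n backup-snap vg0/data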

I am more concerned about lengthy snapshot activation due to a big, linear 
CoW table that must be read completely...

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-07  9:12   ` Gionatan Danti
@ 2017-04-07 13:50     ` L A Walsh
  2017-04-07 16:33       ` Gionatan Danti
  2017-04-07 22:24     ` Mark Mielke
  1 sibling, 1 reply; 94+ messages in thread
From: L A Walsh @ 2017-04-07 13:50 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Mark Mielke

Gionatan Danti wrote:
> I am more concerned about lengthy snapshot activation due to a big, 
> linear CoW table that must be read completely...
---
    What is 'big'?  Are you just worried about the IO time?
If that's the case, much will depend on your HW.  Are we talking
using 8T hard disks concatenated into a single volume, or in a
RAID1, or what?  W/a HW-RAID10 getting over 1GB/s isn't
difficult for a contiguous read.  So how big is the CoW table
and how fragmented is it?  Even w/fragments, with enough spindles
you could still, likely, get enough I/O Ops where I/O speed shouldn't
be a critical bottleneck...

    However, regarding performance, I used to take daily snapshots
using normal LVM (before thin was available) w/rsync creating a
difference volume between yesterday's snapshot and today's content.
On a 1TB volume @ ~75% full, it would take 45min - 1.5 hours to
create.  Multiplied by 8...backups wouldn't just be 'nightly'.
That was using about 12 data spindles.

    Unfortunately I've never benched the thin volumes.  Also,
they were NOT for backup purposes (those were separate using
xfsdump).  Besides performance and reliability, a main reason
to use snapshots was to provide "previous versions" of files to
windows clients.  That allowed quick recoveries from file-wiping
mistakes by opening the previous version of the file or
containing directory.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-07 13:50     ` L A Walsh
@ 2017-04-07 16:33       ` Gionatan Danti
  2017-04-13 12:59         ` Stuart Gathman
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-07 16:33 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Mark Mielke, L A Walsh

Il 07-04-2017 15:50 L A Walsh ha scritto:
> Gionatan Danti wrote:
>> I am more concerned about lengthy snapshot activation due to a big, 
>> linear CoW table that must be read completely...
> ---
>    What is 'big'?  Are you just worried about the IO time?
> If that's the case, much will depend on your HW.  Are we talking
> using 8T hard disks concatenated into a single volume, or in a
> RAID1, or what?  W/a HW-RAID10 getting over 1GB/s isn't
> difficult for a contiguous read.  So how big is the CoW table
> and how fragmented is it?  Even w/fragments, with enough spindles
> you could still, likely, get enough I/O Ops where I/O speed shouldn't
> be a critical bottleneck...

For the logical volume itself, I target an 8+ TB size. However, what 
worries me is *not* LV size by itself (I know that LVM can be used on 
volumes much bigger than that), but rather the snapshot CoW table. In 
short, from reading this list and from first-hand testing, big snapshots 
(20+ GB) require lengthy activation, due to inefficiencies in how classic 
snapshot metadata (ie: non thinly-provisioned) are laid out/used. However, 
I read that this was somewhat addressed lately. Do you have any insight?

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-06 14:31 [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM Gionatan Danti
  2017-04-07  8:19 ` Mark Mielke
@ 2017-04-07 18:21 ` Tomas Dalebjörk
  2017-04-13 10:20 ` Gionatan Danti
  2 siblings, 0 replies; 94+ messages in thread
From: Tomas Dalebjörk @ 2017-04-07 18:21 UTC (permalink / raw)
  To: LVM general discussion and development


Hi

Agentless snapshots of the VM server might be an issue with applications
running in the VM guest OS.
Especially as there are no VSS-like features on Linux.

Perhaps someone can introduce a udev listener that can be used?

Den 6 apr. 2017 16:32 skrev "Gionatan Danti" <g.danti@assyoma.it>:

> Hi all,
> I'm seeking some advice for a new virtualization system (KVM) on top of
> LVM. The goal is to take agentless backups via LVM snapshots.
>
> In short: what would you suggest for snapshotting a quite big (8+ TB) volume?
> Classic LVM (with old snapshot behavior) or thinlvm (and its new snapshot
> method)?
>
> Long story:
> In the past, I used classical, preallocated logical volumes directly
> exported as virtual disks. In this case, I snapshot the single LV I want to
> back up and, using dd/ddrescue, I copy it.
>
> Problem is this solution prevents any use of thin allocation or sparse
> files, so I tried to replace it with something filesystem-based. Lately I
> used another approach, configuring a single thinly provisioned LV (with no
> zeroing) + XFS + raw or qcow2 virtual machine images. To make backups, I
> snapshotted the entire thin LV and, after mounting it, I copied the
> required files.
>
> So far this second solution worked quite well. However, before using it in
> more and more installations, I wonder if it is the correct approach or if
> something better, especially from a stability standpoint, is possible.
>
> Given that I would like to use XFS, and that I need snapshots at the block
> level, two possibilities came to mind:
>
> 1) continue to use thinlvm + thin snapshots + XFS. What do you think about
> an 8+ TB thin pool/volume with relatively small (64/128KB) chunks? Would you
> be comfortable using it in production workloads? What about powerloss
> protection? From my understanding, thinlvm passes flushes anytime the
> higher layers issue them and so should be reasonably safe against
> unexpected powerloss. Is this view right?
>
> 2) use a classic (non-thin) LVM + normal snapshot + XFS. I know for sure
> that LV size is not an issue here; however, big snapshot sizes used to be
> problematic: the CoW table had to be read completely before the snapshot
> could be activated. Has this problem been solved, or can big snapshots
> still be problematic?
>
> Thank you all.
>
> --
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.danti@assyoma.it - info@assyoma.it
> GPG public key ID: FF5F32A8
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
>


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-07  9:12   ` Gionatan Danti
  2017-04-07 13:50     ` L A Walsh
@ 2017-04-07 22:24     ` Mark Mielke
  2017-04-08 11:56       ` Gionatan Danti
  1 sibling, 1 reply; 94+ messages in thread
From: Mark Mielke @ 2017-04-07 22:24 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development


On Fri, Apr 7, 2017 at 5:12 AM, Gionatan Danti <g.danti@assyoma.it> wrote:

> Il 07-04-2017 10:19 Mark Mielke ha scritto:
>
>>
>> I found classic LVM snapshots to suffer terrible performance. I
>> switched to BTRFS as a result, until LVM thin pools became a real
>> thing, and I happily switched back.
>>
>
> So you are now on lvmthin? Can I ask on what pool/volume/filesystem size?


We use lvmthin in many areas... from Docker's dm-thinp driver, to XFS file
systems for PostgreSQL or other data that need multiple snapshots,
including point-in-time backup of certain snapshots. Then, multiple sizes.
I don't know that we have 8 TB anywhere right this second, but we are using
it in a variety of ranges from 20 GB to 4 TB.


>
>> I expect this depends on exactly what access patterns you have, how
>> many accesses will happen during the time the snapshot is held, and
>> whether you are using spindles or flash. Still, even with some attempt
>> to be objective and critical... I think I would basically never use
>> classic LVM snapshots for any purpose, ever.
>>
>
> Sure, but for nightly backups reduced performance should not be a problem.
> Moreover, increasing snapshot chunk size (eg: from default 4K to 64K) gives
> much faster write performance.
>


When you say "nightly", my experience is that processes are writing data
all of the time. If the backup takes 30 minutes to complete, then this is
30 minutes of writes that get accumulated, and subsequent performance
overhead of these writes.

But, we usually keep multiple hourly snapshots and multiple daily
snapshots, because we want the option to recover to different points in
time. With the classic LVM snapshot capability, I believe this is
essentially non-functional. While it can work with "1 short lived
snapshot", I don't think it works at all well for "3 hourly + 3 daily
snapshots".  Remember that each write to an area will require that area to
be replicated multiple times under classic LVM snapshots, before the
original write can be completed. Every additional snapshot is an additional
cost.



> I am more concerned about lengthy snapshot activation due to a big, linear
> CoW table that must be read completely...



I suspect this is a pre-optimization concern, in that you are concerned,
and you are theorizing about impact, but perhaps you haven't measured it
yourself, and if you did, you would find there was no reason to be
concerned. :-)

If you absolutely need a contiguous sequence of blocks for your drives,
because your I/O patterns benefit from this, or because your hardware has
poor seek performance (such as, perhaps a tape drive? :-) ), then classic
LVM snapshots would retain this ordering for the live copy, and the
snapshot could be as short lived as possible to minimize overhead to only
that time period.

But, in practice - I think the LVM authors of the thinpool solution
selected a default block size that would exhibit good behaviour on most
common storage solutions. You can adjust it, but in most cases I think I
don't bother, and just use the default. There is also the behaviour of the
systems in general to take into account in that even if you had a purely
contiguous sequence of blocks, your file system probably allocates files
all over the drive anyways. With XFS, I believe they do this for
concurrency, in that two different kernel threads can allocate new files
without blocking each other, because they schedule the writes to two
different areas of the disk, with separate inode tables.
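
If you want to see what a given pool actually ended up with, something 
like this should show it (the pool name is a placeholder):

    lvs -o lv_name,chunk_size,data_percent vg0/pool0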

So, I don't believe the contiguous sequence of blocks is normally a real
thing. Perhaps a security camera that is recording a 1+ TB video stream
might allocate contiguous, but basically nothing else does this.

To me, LVM thin volumes is the right answer to this problem. It's not
particularly new or novel either. Most "Enterprise" level storage systems
have had this capability for many years. At work, we use NetApp and they
take this to another level with their WAFL = Write-Anywhere-File-Layout.
For our private cloud solution based upon NetApp AFF 8080EX today, we have
disk shelves filled with flash drives, and NetApp is writing everything
"forwards", which extends the life of the flash drives, and allows us to
keep many snapshots of the data. But, it doesn't have to be flash to take
advantage of this. We also have large NetApp FAS 8080EX or 8060 with all
spindles, including 3.5" SATA disks. I was very happy to see this type of
technology make it back into LVM. I think this breathed new life into LVM,
and made it a practical solution for many new use cases beyond being just a
more flexible partition manager.


-- 
Mark Mielke <mark.mielke@gmail.com>


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-07 22:24     ` Mark Mielke
@ 2017-04-08 11:56       ` Gionatan Danti
  0 siblings, 0 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-08 11:56 UTC (permalink / raw)
  To: Mark Mielke; +Cc: LVM general discussion and development

Il 08-04-2017 00:24 Mark Mielke ha scritto:
> 
> We use lvmthin in many areas... from Docker's dm-thinp driver, to XFS
> file systems for PostgreSQL or other data that need multiple
> snapshots, including point-in-time backup of certain snapshots. Then,
> multiple sizes. I don't know that we have 8 TB anywhere right this
> second, but we are using it in a variety of ranges from 20 GB to 4 TB.
> 

Very interesting, this is the exact information I hoped to get. Thank 
you for reporting.

> 
> When you say "nightly", my experience is that processes are writing
> data all of the time. If the backup takes 30 minutes to complete, then
> this is 30 minutes of writes that get accumulated, and subsequent
> performance overhead of these writes.
> 
> But, we usually keep multiple hourly snapshots and multiple daily
> snapshots, because we want the option to recover to different points
> in time. With the classic LVM snapshot capability, I believe this is
> essentially non-functional. While it can work with "1 short lived
> snapshot", I don't think it works at all well for "3 hourly + 3 daily
> snapshots".  Remember that each write to an area will require that
> area to be replicated multiple times under classic LVM snapshots,
> before the original write can be completed. Every additional snapshot
> is an additional cost.

Right. For such a setup, classic LVM snapshot overhead would be 
enormous, grinding everything to a halt.

> 
>> I am more concerned about lengthy snapshot activation due to a big,
>> linear CoW table that must be read completely...
> 
> I suspect this is a pre-optimization concern, in that you are
> concerned, and you are theorizing about impact, but perhaps you
> haven't measured it yourself, and if you did, you would find there was
> no reason to be concerned. :-)

For classic (non-thinly provisioned) LVM snapshots, relatively big 
metadata size was a known problem. Many discussions happened on this list 
about this very topic. Basically, when the snapshot metadata size 
increased above a certain point (measured in some GB), snapshot 
activation failed due to timeouts on LVM commands. This, in turn, was 
because legacy snapshot behavior was not really tuned for long-lived, 
multi-gigabyte snapshots, but rather for create-backup-remove usage.

> 
> If you absolutely need a contiguous sequence of blocks for your
> drives, because your I/O patterns benefit from this, or because your
> hardware has poor seek performance (such as, perhaps a tape drive? :-)
> ), then classic LVM snapshots would retain this ordering for the live
> copy, and the snapshot could be as short lived as possible to minimize
> overhead to only that time period.
> 
> But, in practice - I think the LVM authors of the thinpool solution
> selected a default block size that would exhibit good behaviour on
> most common storage solutions. You can adjust it, but in most cases I
> think I don't bother, and just use the default. There is also the
> behaviour of the systems in general to take into account in that even
> if you had a purely contiguous sequence of blocks, your file system
> probably allocates files all over the drive anyways. With XFS, I
> believe they do this for concurrency, in that two different kernel
> threads can allocate new files without blocking each other, because
> they schedule the writes to two different areas of the disk, with
> separate inode tables.
> 
> So, I don't believe the contiguous sequence of blocks is normally a
> real thing. Perhaps a security camera that is recording a 1+ TB video
> stream might allocate contiguous, but basically nothing else does
> this.

True.

> 
> To me, LVM thin volumes is the right answer to this problem. It's not
> particularly new or novel either. Most "Enterprise" level storage
> systems have had this capability for many years. At work, we use
> NetApp and they take this to another level with their WAFL =
> Write-Anywhere-File-Layout. For our private cloud solution based upon
> NetApp AFF 8080EX today, we have disk shelves filled with flash
> drives, and NetApp is writing everything "forwards", which extends the
> life of the flash drives, and allows us to keep many snapshots of the
> data. But, it doesn't have to be flash to take advantage of this. We
> also have large NetApp FAS 8080EX or 8060 with all spindles, including
> 3.5" SATA disks. I was very happy to see this type of technology make
> it back into LVM. I think this breathed new life into LVM, and made it
> a practical solution for many new use cases beyond being just a more
> flexible partition manager.
> 
> --
> 
> Mark Mielke <mark.mielke@gmail.com>

Yeah, CoW-enabled filesystems are really cool ;) Too bad BTRFS has very 
low performance when used as a VM backing store...

Thank you very much Mark, I really appreciate the information you 
provided.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-06 14:31 [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM Gionatan Danti
  2017-04-07  8:19 ` Mark Mielke
  2017-04-07 18:21 ` Tomas Dalebjörk
@ 2017-04-13 10:20 ` Gionatan Danti
  2017-04-13 12:41   ` Xen
  2 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-13 10:20 UTC (permalink / raw)
  To: linux-lvm

On 06/04/2017 16:31, Gionatan Danti wrote:
> Hi all,
> I'm seeking some advice for a new virtualization system (KVM) on top of
> LVM. The goal is to take agentless backups via LVM snapshots.
>
> In short: what would you suggest for snapshotting a quite big (8+ TB) volume?
> Classic LVM (with old snapshot behavior) or thinlvm (and its new
> snapshot method)?
>
> Long story:
> In the past, I used classical, preallocated logical volumes directly
> exported as virtual disks. In this case, I snapshot the single LV I want
> to back up and, using dd/ddrescue, I copy it.
>
> Problem is this solution prevents any use of thin allocation or sparse
> files, so I tried to replace it with something filesystem-based. Lately
> I used another approach, configuring a single thinly provisioned LV
> (with no zeroing) + XFS + raw or qcow2 virtual machine images. To make
> backups, I snapshotted the entire thin LV and, after mounting it, I
> copied the required files.
>
> So far this second solution worked quite well. However, before using it
> in more and more installations, I wonder if it is the correct approach
> or if something better, especially from a stability standpoint, is
> possible.
>
> Given that I would like to use XFS, and that I need snapshots at the
> block level, two possibilities came to mind:
>
> 1) continue to use thinlvm + thin snapshots + XFS. What do you think
> about an 8+ TB thin pool/volume with relatively small (64/128KB) chunks?
> Would you be comfortable using it in production workloads? What about
> powerloss protection? From my understanding, thinlvm passes flushes
> anytime the higher layers issue them and so should be reasonably safe
> against unexpected powerloss. Is this view right?
>
> 2) use a classic (non-thin) LVM + normal snapshot + XFS. I know for sure
> that LV size is not an issue here; however, big snapshot sizes used to be
> problematic: the CoW table had to be read completely before the snapshot
> could be activated. Has this problem been solved, or can big snapshots
> still be problematic?
>
> Thank you all.
>

Hi,
anyone with other thoughts on the matter?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 10:20 ` Gionatan Danti
@ 2017-04-13 12:41   ` Xen
  2017-04-14  7:20     ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-13 12:41 UTC (permalink / raw)
  To: linux-lvm

Gionatan Danti schreef op 13-04-2017 12:20:

> Hi,
> anyone with other thoughts on the matter?

I wondered why a single thin LV does work for you in terms of not 
wasting space or being able to make more efficient use of "volumes" or 
client volumes or whatever.

But a multitude of thin volumes won't.

See, you only compared multiple non-thin with a single-thin.

So my question is:

did you consider multiple thin volumes?


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-07 16:33       ` Gionatan Danti
@ 2017-04-13 12:59         ` Stuart Gathman
  2017-04-13 13:52           ` Xen
  2017-04-14  7:23           ` Gionatan Danti
  0 siblings, 2 replies; 94+ messages in thread
From: Stuart Gathman @ 2017-04-13 12:59 UTC (permalink / raw)
  To: linux-lvm

Using a classic snapshot for backup does not normally involve activating
a large CoW.  I generally create a smallish snapshot (a few gigs),  that
will not fill up during the backup process.   If for some reason, a
snapshot were to fill up before backup completion, reads from the
snapshot get I/O errors (I've tested this), which alarms and aborts the
backup.  Yes, keeping a snapshot around and activating it at boot can be
a problem as the CoW gets large.
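
Something along these lines (just a sketch; names and sizes are placeholders):

    # small classic snapshot, alive only for the duration of the backup
    lvcreate -s -L 4G -n nightly-snap vg0/data
    lvs -o lv_name,snap_percent vg0/nightly-snap    # peek at CoW usage if curious
    dd if=/dev/vg0/nightly-snap of=/backup/data.img bs=1M \
        || echo "snapshot overflowed (reads failed), backup aborted"
    lvremove -y vg0/nightly-snap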

If you are going to keep snapshots around indefinitely, the thinpools
are probably the way to go.  (What happens when you fill up those? 
Hopefully it "freezes" the pool rather than losing everything.)

On 04/07/2017 12:33 PM, Gionatan Danti wrote:

> For the logical volume itself, I target an 8+ TB size. However, what
> worries me is *not* LV size by itself (I know that LVM can be used on
> volumes much bigger than that), but rather the snapshot CoW table. In
> short, from reading this list and from first-hand testing, big
> snapshots (20+ GB) require lengthy activation, due to inefficiencies in
> how classic snapshot metadata (ie: non thinly-provisioned) are laid out/used.
> However, I read that this was somewhat addressed lately. Do you have
> any insight?
>


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 12:59         ` Stuart Gathman
@ 2017-04-13 13:52           ` Xen
  2017-04-13 14:33             ` Zdenek Kabelac
  2017-04-14  7:23           ` Gionatan Danti
  1 sibling, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-13 13:52 UTC (permalink / raw)
  To: linux-lvm

Stuart Gathman schreef op 13-04-2017 14:59:

> If you are going to keep snapshots around indefinitely, the thinpools
> are probably the way to go.  (What happens when you fill up those?
> Hopefully it "freezes" the pool rather than losing everything.)

My experience is that the system crashes.

I have not tested this with a snapshot but a general thin pool overflow 
crashes the system.

Within half a minute, I think.

It is irrelevant whether the volumes had anything to do with the 
operation of the system; ie. some mounted volumes that you write to that 
are in no other use will crash the system.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 13:52           ` Xen
@ 2017-04-13 14:33             ` Zdenek Kabelac
  2017-04-13 14:47               ` Xen
                                 ` (2 more replies)
  0 siblings, 3 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-13 14:33 UTC (permalink / raw)
  To: LVM general discussion and development, Xen

Dne 13.4.2017 v 15:52 Xen napsal(a):
> Stuart Gathman schreef op 13-04-2017 14:59:
> 
>> If you are going to keep snapshots around indefinitely, the thinpools
>> are probably the way to go.  (What happens when you fill up those?
>> Hopefully it "freezes" the pool rather than losing everything.)
> 
> My experience is that the system crashes.
> 
> I have not tested this with a snapshot but a general thin pool overflow 
> crashes the system.
> 
> Within half a minute, I think.
> 
> It is irrelevant whether the volumes had anything to do with the operation of 
> the system; ie. some mounted volumes that you write to that are in no other 
> use will crash the system.

Hello

Just let's repeat.

A full thin-pool is NOT in any way comparable to a full filesystem.

A full filesystem has ALWAYS room for its metadata - it's not pretending it's 
bigger - it has 'finite' space and expects this space to just BE there.

Now when you have a thin-pool - it causes quite a lot of trouble across a 
number of layers.  These are solvable and being fixed.

But rule #1 still applies - do not run your thin-pool out of space - it 
will not always heal easily without losing data - there is no simple, 
straightforward way to fix it (especially when the user cannot ADD any new 
space he promised to have).

So monitoring the pool and taking action ahead of time is always a superior 
solution to any later postmortem system restore.


Regards

Zdenek


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 14:33             ` Zdenek Kabelac
@ 2017-04-13 14:47               ` Xen
  2017-04-13 15:29               ` Stuart Gathman
  2017-04-14  7:27               ` Gionatan Danti
  2 siblings, 0 replies; 94+ messages in thread
From: Xen @ 2017-04-13 14:47 UTC (permalink / raw)
  To: linux-lvm

Zdenek Kabelac schreef op 13-04-2017 16:33:

> Hello
> 
> Just let's repeat.
> 
> A full thin-pool is NOT in any way comparable to a full filesystem.
> 
> A full filesystem has ALWAYS room for its metadata - it's not pretending
> it's bigger - it has 'finite' space and expects this space to just BE
> there.
> 
> Now when you have a thin-pool - it causes quite a lot of trouble across
> a number of layers.  These are solvable and being fixed.
> 
> But rule #1 still applies - do not run your thin-pool out of
> space - it will not always heal easily without losing data - there is
> no simple, straightforward way to fix it (especially when the user
> cannot ADD any new space he promised to have).
> 
> So monitoring the pool and taking action ahead of time is always a
> superior solution to any later postmortem system restore.

Yes that's what I said. If your thin pool runs out, your system will 
crash.

Thanks for implying that this will also happen if a thin snapshot causes 
this (obviously).

Regards.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 14:33             ` Zdenek Kabelac
  2017-04-13 14:47               ` Xen
@ 2017-04-13 15:29               ` Stuart Gathman
  2017-04-13 15:43                 ` Xen
  2017-04-14  7:27               ` Gionatan Danti
  2 siblings, 1 reply; 94+ messages in thread
From: Stuart Gathman @ 2017-04-13 15:29 UTC (permalink / raw)
  To: linux-lvm

On 04/13/2017 10:33 AM, Zdenek Kabelac wrote:
>
>
> Now when you have a thin-pool - it causes quite a lot of trouble across
> a number of layers.  These are solvable and being fixed.
>
> But rule #1 still applies - do not run your thin-pool out of
> space - it will not always heal easily without losing data - there is
> no simple, straightforward way to fix it (especially when the user
> cannot ADD any new space he promised to have).
IMO, the friendliest thing to do is to freeze the pool in read-only mode
just before running out of metadata.  While still involving application
level data loss (the data it was just trying to write), and still
crashing the system (the system may be up and pingable and maybe even
sshable, but is "crashed" for normal purposes), it is simple to
understand and recover.   A sysadmin could have a plain LV for the
system volume, so that logs and stuff would still be kept, and admin
logins work normally.  There is no panic, as the data is there read-only.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 15:29               ` Stuart Gathman
@ 2017-04-13 15:43                 ` Xen
  2017-04-13 17:26                   ` Stuart D. Gathman
  2017-04-13 17:32                   ` Stuart D. Gathman
  0 siblings, 2 replies; 94+ messages in thread
From: Xen @ 2017-04-13 15:43 UTC (permalink / raw)
  To: linux-lvm

Stuart Gathman schreef op 13-04-2017 17:29:

> IMO, the friendliest thing to do is to freeze the pool in read-only 
> mode
> just before running out of metadata.

It's not about metadata but about physical extents.

In the thin pool.

> While still involving application
> level data loss (the data it was just trying to write), and still
> crashing the system (the system may be up and pingable and maybe even
> sshable, but is "crashed" for normal purposes)

Then it's not crashed. Only some application that may make use of the 
data volume may be crashed, but not the entire system.

The point is that errors, on some filesystem that has errors=remount-ro, 
are okay.

If a regular snapshot that is mounted fills up, the mount is dropped.

System continues operating, as normal.

> , it is simple to
> understand and recover.   A sysadmin could have a plain LV for the
> system volume, so that logs and stuff would still be kept, and admin
> logins work normally.  There is no panic, as the data is there 
> read-only.

Yeah a system panic in terms of some volume becoming read-only is 
perfectly acceptable.

However, the kernel going into complete mayhem is not.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 15:43                 ` Xen
@ 2017-04-13 17:26                   ` Stuart D. Gathman
  2017-04-13 17:32                   ` Stuart D. Gathman
  1 sibling, 0 replies; 94+ messages in thread
From: Stuart D. Gathman @ 2017-04-13 17:26 UTC (permalink / raw)
  To: LVM general discussion and development

On Thu, 13 Apr 2017, Xen wrote:

> Stuart Gathman schreef op 13-04-2017 17:29:
>
>>  understand and recover.   A sysadmin could have a plain LV for the
>>  system volume, so that logs and stuff would still be kept, and admin
>>  logins work normally.  There is no panic, as the data is there read-only.
>
> Yeah a system panic in terms of some volume becoming read-only is perfectly 
> acceptable.
>
> However, the kernel going into complete mayhem is not.

Heh.  I was actually referring to *sysadmin* panic, not kernel panic.
:-)

But yeah, sysadmin panic can result in massive data loss...

-- 
 	      Stuart D. Gathman <stuart@gathman.org>
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 15:43                 ` Xen
  2017-04-13 17:26                   ` Stuart D. Gathman
@ 2017-04-13 17:32                   ` Stuart D. Gathman
  2017-04-14 15:17                     ` Xen
  1 sibling, 1 reply; 94+ messages in thread
From: Stuart D. Gathman @ 2017-04-13 17:32 UTC (permalink / raw)
  To: LVM general discussion and development

On Thu, 13 Apr 2017, Xen wrote:

> Stuart Gathman schreef op 13-04-2017 17:29:
>
>>  IMO, the friendliest thing to do is to freeze the pool in read-only mode
>>  just before running out of metadata.
>
> It's not about metadata but about physical extents.
>
> In the thin pool.

Ok.  My understanding is that *all* the volumes in the same thin-pool would 
have to be frozen when running out of extents, as writes all pull from
the same pool of physical extents.

-- 
 	      Stuart D. Gathman <stuart@gathman.org>
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 12:41   ` Xen
@ 2017-04-14  7:20     ` Gionatan Danti
  2017-04-14  8:24       ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-14  7:20 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen

Il 13-04-2017 14:41 Xen ha scritto:
> 
> See, you only compared multiple non-thin with a single-thin.
> 
> So my question is:
> 
> did you consider multiple thin volumes?
> 

Hi, the multiple-thin-volume solution, while being very flexible, is not 
well understood by libvirt and virt-manager. So I need to pass on that 
(for the moment at least).

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 12:59         ` Stuart Gathman
  2017-04-13 13:52           ` Xen
@ 2017-04-14  7:23           ` Gionatan Danti
  2017-04-14 15:23             ` Xen
  1 sibling, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-14  7:23 UTC (permalink / raw)
  To: LVM general discussion and development

Il 13-04-2017 14:59 Stuart Gathman ha scritto:
> Using a classic snapshot for backup does not normally involve 
> activating
> a large CoW.  I generally create a smallish snapshot (a few gigs),  
> that
> will not fill up during the backup process.   If for some reason, a
> snapshot were to fill up before backup completion, reads from the
> snapshot get I/O errors (I've tested this), which alarms and aborts the
> backup.  Yes, keeping a snapshot around and activating it at boot can 
> be
> a problem as the CoW gets large.
> 
> If you are going to keep snapshots around indefinitely, the thinpools
> are probably the way to go.  (What happens when you fill up those?
> Hopefully it "freezes" the pool rather than losing everything.)
> 

Hi, no need to keep snapshots around. If I had to, the classic LVM 
solution would be completely inadequate.

I simply worry that, with many virtual machines, even the temporary 
backup snapshot can fill up and cause some problem. When the snapshot 
fills, apart from it being dropped, is there anything I need to be 
worried about?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 14:33             ` Zdenek Kabelac
  2017-04-13 14:47               ` Xen
  2017-04-13 15:29               ` Stuart Gathman
@ 2017-04-14  7:27               ` Gionatan Danti
  2 siblings, 0 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-14  7:27 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen

Il 13-04-2017 16:33 Zdenek Kabelac ha scritto:
> 
> Hello
> 
> Just let's repeat.
> 
> A full thin-pool is NOT in any way comparable to a full filesystem.
> 
> A full filesystem has ALWAYS room for its metadata - it's not pretending
> it's bigger - it has 'finite' space and expects this space to just BE
> there.
> 
> Now when you have a thin-pool - it causes quite a lot of trouble across
> a number of layers.  These are solvable and being fixed.
> 
> But rule #1 still applies - do not run your thin-pool out of
> space - it will not always heal easily without losing data - there is
> no simple, straightforward way to fix it (especially when the user
> cannot ADD any new space he promised to have).
> 
> So monitoring the pool and taking action ahead of time is always a
> superior solution to any later postmortem system restore.
> 

If I remember correctly, EXT4 with errors=remount-ro should freeze the 
filesystem as soon as write errors are detected. Is this configuration 
safer than standard behavior? Do you know if XFS (RHEL *default* 
filesystem) supports something similar?
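
For reference, the ext4 behavior I mean is enabled like this (device and 
mount point are placeholders):

    # as a persistent superblock default...
    tune2fs -e remount-ro /dev/vg0/somelv
    # ...or per mount
    mount -o errors=remount-ro /dev/vg0/somelv /mnt/data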

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14  7:20     ` Gionatan Danti
@ 2017-04-14  8:24       ` Zdenek Kabelac
  2017-04-14  9:07         ` Gionatan Danti
  2017-04-22  7:14         ` Gionatan Danti
  0 siblings, 2 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-14  8:24 UTC (permalink / raw)
  To: LVM general discussion and development, Gionatan Danti; +Cc: Xen

Dne 14.4.2017 v 09:20 Gionatan Danti napsal(a):
> Il 13-04-2017 14:41 Xen ha scritto:
>>
>> See, you only compared multiple non-thin with a single-thin.
>>
>> So my question is:
>>
>> did you consider multiple thin volumes?
>>
> 
> Hi, the multiple-thin-volume solution, while being very flexible, is not well 
> understood by libvirt and virt-manager. So I need to pass on that (for the 
> moment at least).
> 


Well, recent versions of lvm2 (>=169, even though they are marked as 
experimental) do support script execution of a command for easier 
maintenance of a thin-pool being filled above some percentage.

So it should be 'relatively' easy to set up a solution where you can fill
your pool to e.g. 90% and, if it gets above that, kill your surrounding
libvirt and resolve the missing resources (deleting virt machines..)

But it's currently impossible to expect you will fill the thin-pool to full 
capacity and everything will continue to run smoothly - this is not going to 
happen.

However there are many different solutions for different problems - and with 
current script execution - user may build his own solution - i.e.  call
'dmsetup remove -f' for running thin volumes - so all instances get 'error' 
device   when pool is above some threshold setting (just like old 'snapshot' 
invalidation worked) - this way user will just kill thin volume user task, but 
will still keep thin-pool usable for easy maintenance.
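
A rough sketch of such a setup, using the dmeventd 'thin_command' setting 
from lvm.conf (all names, paths and the 90% threshold below are just 
placeholders):

    # lvm.conf (dmeventd section): run a custom script instead of the
    # default "lvm lvextend --use-policies" policy
    dmeventd {
        thin_command = "/usr/local/sbin/thin-pool-watch.sh"
    }

    #!/bin/sh
    # /usr/local/sbin/thin-pool-watch.sh - called by dmeventd as the pool fills
    PCT=$(lvs --noheadings -o data_percent vg0/pool0 | tr -d ' ' | cut -d. -f1)
    if [ "${PCT:-0}" -ge 90 ]; then
        # stop the consumers, or error out selected thin volumes, e.g.:
        #   dmsetup remove -f vg0-some_thin_volume
        systemctl stop libvirtd
    fi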

Regards

Zdenek


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14  8:24       ` Zdenek Kabelac
@ 2017-04-14  9:07         ` Gionatan Danti
  2017-04-14  9:37           ` Zdenek Kabelac
  2017-04-22  7:14         ` Gionatan Danti
  1 sibling, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-14  9:07 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: Xen, LVM general discussion and development

Il 14-04-2017 10:24 Zdenek Kabelac ha scritto:
> 
> But it's currently impossible to expect you will fill the thin-pool to
> full capacity and everything will continue to run smoothly - this is
> not going to happen.

Even with EXT4 and errors=remount-ro?

> 
> However there are many different solutions for different problems -
> and with current script execution - user may build his own solution -
> i.e.  call
> 'dmsetup remove -f' for running thin volumes - so all instances get
> 'error' device   when pool is above some threshold setting (just like
> old 'snapshot' invalidation worked) - this way user will just kill
> thin volume user task, but will still keep thin-pool usable for easy
> maintenance.
> 

Interesting. However, the main problem with libvirt is that its 
pool/volume management falls apart when used on thin-pools. Basically, 
libvirt does not understand that a thinpool is a container for thin 
volumes (ie: 
https://www.redhat.com/archives/libvirt-users/2014-August/msg00010.html)

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14  9:07         ` Gionatan Danti
@ 2017-04-14  9:37           ` Zdenek Kabelac
  2017-04-14  9:55             ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-14  9:37 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: Xen, LVM general discussion and development

Dne 14.4.2017 v 11:07 Gionatan Danti napsal(a):
> Il 14-04-2017 10:24 Zdenek Kabelac ha scritto:
>>
>> But it's currently impossible to expect you will fill the thin-pool to
>> full capacity and everything will continue to run smoothly - this is
>> not going to happen.
> 
> Even with EXT4 and errors=remount-ro?

While usage of 'remount-ro' may prevent any significant damage to the 
filesystem as such - since ext4 stops at the first problem it detects - 
it's still not quite trivial to proceed easily further.

The problem is not with 'stopping' access - but with gaining the access back.

So in this case - you need to run 'fsck' - and this fsck usually needs more 
space - and the complexity starts with - where to get this space.

In the 'most trivial' case - you have the space in 'VG' - you just extend 
thin-pool and you run 'fsck' and it works.
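
I.e. in that trivial case it is roughly (placeholder names and sizes):

    # grow the pool data and/or metadata from free space in the VG
    lvextend -L +100G vg0/pool0
    lvextend --poolmetadatasize +1G vg0/pool0
    # then repair the filesystem on the affected thin LV and remount it
    fsck.ext4 -f /dev/vg0/some_thin_lv
    mount -o remount,rw /mnt/data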

But then there are a number of cases ending with the case that you run out 
of metadata space, which has a maximal size of ~16G, so you can't even 
extend it, simply because it's unsupported to use any bigger size.

So while every case has some way forward how to proceed - none of them could 
be easily automated.

And it's so much easier to monitor and prevent this from happening compared 
with solving these things later.

So all that is needed is that the user is aware of what he is using and 
takes the proper action at the proper time.

> 
>>
>> However there are many different solutions for different problems -
>> and with current script execution - user may build his own solution -
>> i.e.  call
>> 'dmsetup remove -f' for running thin volumes - so all instances get
>> 'error' device   when pool is above some threshold setting (just like
>> old 'snapshot' invalidation worked) - this way user will just kill
>> thin volume user task, but will still keep thin-pool usable for easy
>> maintenance.
>>
> 
> Interesting. However, the main problem with libvirt is that its pool/volume 
> management falls apart when used on thin-pools. Basically, libvirt does not 
> understand that a thinpool is a container for thin volumes (ie: 
> https://www.redhat.com/archives/libvirt-users/2014-August/msg00010.html)

Well lvm2 provides the low-level tooling here....

Zdenek


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14  9:37           ` Zdenek Kabelac
@ 2017-04-14  9:55             ` Gionatan Danti
  0 siblings, 0 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-14  9:55 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: Xen, LVM general discussion and development

Il 14-04-2017 11:37 Zdenek Kabelac ha scritto:
> The problem is not with 'stopping' access - but with gaining the access 
> back.
> 
> So in this case - you need to run 'fsck' - and this fsck usually needs
> more space - and the complexity starts with - where to get this space.
> 
> In the 'most trivial' case - you have the space in 'VG' - you just
> extend thin-pool and you run 'fsck' and it works.
> 
> But then there are a number of cases ending with the case that you run
> out of metadata space, which has a maximal size of ~16G, so you can't
> even extend it, simply because it's unsupported to use any bigger
> size.
> 
> So while every case has some way forward how to proceed - none of them
> could be easily automated.

To better understand: what would be the (manual) solution here, if the 
metadata is full and cannot be extended due to the hard 16 GB limit?

> And it's so much easier to monitor and prevent this from happening compared
> with solving these things later.
> 
> So all that is needed is that the user is aware of what he is using and
> takes the proper action at the proper time.
> 

Absolutely. However, monitoring can also fail - a safe failure model is 
a really important thing.

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-13 17:32                   ` Stuart D. Gathman
@ 2017-04-14 15:17                     ` Xen
  0 siblings, 0 replies; 94+ messages in thread
From: Xen @ 2017-04-14 15:17 UTC (permalink / raw)
  To: linux-lvm

Stuart D. Gathman schreef op 13-04-2017 19:32:
> On Thu, 13 Apr 2017, Xen wrote:
> 
>> Stuart Gathman schreef op 13-04-2017 17:29:
>> 
>>>  IMO, the friendliest thing to do is to freeze the pool in read-only 
>>> mode
>>>  just before running out of metadata.
>> 
>> It's not about metadata but about physical extents.
>> 
>> In the thin pool.
> 
> Ok.  My understanding is that *all* the volumes in the same thin-pool
> would have to be frozen when running out of extents, as writes all
> pull from
> the same pool of physical extents.

Yes, I simply tested with a small thin pool not used for anything else.

The volumes were not more than a few hundred megabytes big, so easy to 
fill up.

Copying a file that the pool couldn't handle onto one of the volumes, 
the system quickly crashed.

Upon reboot it was neatly filled 100% and I could casually remove the 
volumes or whatever.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14  7:23           ` Gionatan Danti
@ 2017-04-14 15:23             ` Xen
  2017-04-14 15:53               ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-14 15:23 UTC (permalink / raw)
  To: linux-lvm

Gionatan Danti schreef op 14-04-2017 9:23:
> Il 13-04-2017 14:59 Stuart Gathman ha scritto:
>> Using a classic snapshot for backup does not normally involve 
>> activating
>> a large CoW.  I generally create a smallish snapshot (a few gigs),  
>> that
>> will not fill up during the backup process.   If for some reason, a
>> snapshot were to fill up before backup completion, reads from the
>> snapshot get I/O errors (I've tested this), which alarms and aborts 
>> the
>> backup.  Yes, keeping a snapshot around and activating it at boot can 
>> be
>> a problem as the CoW gets large.
>> 
>> If you are going to keep snapshots around indefinitely, the thinpools
>> are probably the way to go.  (What happens when you fill up those?
>> Hopefully it "freezes" the pool rather than losing everything.)
>> 
> 
> Hi, no need to keep snapshots around. If I had to, the classic LVM
> solution would be completely inadequate.
> 
> I simply worry that, with many virtual machines, even the temporary
> backup snapshot can fill up and cause some problem. When the snapshot
> fills, apart from it being dropped, is there anything I need to be
> worried about?

A thin snapshot won't be dropped. It is allocated with the same size as 
the origin volume and hence can never fill up.

Only the pool itself can fill up but unless you have some monitoring 
software in place that can intervene in case of anomaly and kill the 
snapshot, your system will or may simply freeze and not drop anything.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 15:23             ` Xen
@ 2017-04-14 15:53               ` Gionatan Danti
  2017-04-14 16:08                 ` Stuart Gathman
  2017-04-14 17:36                 ` Xen
  0 siblings, 2 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-14 15:53 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen

Il 14-04-2017 17:23 Xen ha scritto:
> A thin snapshot won't be dropped. It is allocated with the same size
> as the origin volume and hence can never fill up.
> 
> Only the pool itself can fill up but unless you have some monitoring
> software in place that can intervene in case of anomaly and kill the
> snapshot, your system will or may simply freeze and not drop anything.
> 

Yeah, I understand that. In that sentence, I was speaking about classic 
LVM snapshots.

The dilemma is:
- classic LVM snapshots have low performance (but adequate for backup 
purposes) and, if growing too much, snapshot activation can be 
problematic (especially on boot);
- thin-snapshots have much better performance but do not always fail 
gracefully (ie: pool full).

For nightly backups, which would you pick between the two?
Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 15:53               ` Gionatan Danti
@ 2017-04-14 16:08                 ` Stuart Gathman
  2017-04-14 17:36                 ` Xen
  1 sibling, 0 replies; 94+ messages in thread
From: Stuart Gathman @ 2017-04-14 16:08 UTC (permalink / raw)
  To: linux-lvm

On 04/14/2017 11:53 AM, Gionatan Danti wrote:
>
> Yeah, I understand that. In that sentence, I was speaking about
> classic LVM snapshots.
>
> The dilemma is:
> - classic LVM snapshots have low performance (but adequate for backup
> purposes) and, if growing too much, snapshot activation can be
> problematic (especially on boot);
> - thin-snapshots have much better performance but do not always fail
> gracefully (ie: pool full).
>
> For nightly backups, which would you pick between the two? 
You've summarized it nicely.  I currently stick with classic snapshots
for nightly backups with smallish CoW (so in case backup somehow fails
to remove the snapshot, production performance doesn't suffer).

The failure model for classic snapshots is that if the CoW fills, the
snapshot is invalid (both read and write return IOerror), but otherwise
the system keeps humming along smoothly (with no more performance
penalty on the source volume). 

Before putting production volumes in a thinpool, the failure model needs
to be sane.  However much the admin is enjoined to never let the pool run
out of space - it *will* happen.  Having the entire pool freeze in readonly
mode (without crashing the kernel) would be an acceptable failure mode. 
A more complex failure mode would be to have the other volumes in the
pool keep operating until they need a new extent - at which point they
too freeze.

With a readonly frozen pool, even in the case where metadata is also
full (so you can't add new extents), you can still add new storage and
copy logical volumes to a new pool (with more generous metadata and
chunk sizes).

It is not LVMs problem if the system crashes because a filesystem can't
handle a volume suddenly going readonly.  All filesystems used in a
thinpool should be able to handle that situation.


* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 15:53               ` Gionatan Danti
  2017-04-14 16:08                 ` Stuart Gathman
@ 2017-04-14 17:36                 ` Xen
  2017-04-14 18:59                   ` Gionatan Danti
  1 sibling, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-14 17:36 UTC (permalink / raw)
  To: linux-lvm

Gionatan Danti schreef op 14-04-2017 17:53:
> Il 14-04-2017 17:23 Xen ha scritto:
>> A thin snapshot won't be dropped. It is allocated with the same size
>> as the origin volume and hence can never fill up.
>> 
>> Only the pool itself can fill up but unless you have some monitoring
>> software in place that can intervene in case of anomaly and kill the
>> snapshot, your system will or may simply freeze and not drop anything.
>> 
> 
> Yeah, I understand that. In that sentence, I was speaking about
> classic LVM snapshot.
> 
> The dilemma is:
> - classic LVM snapshots have low performance (but adequate for backup
> purpose) and, if growing too much, snapshot activation can be
> problematic (especially on boot);
> - thin-snapshots have much better performance but does not always fail
> gracefully (ie: pool full).
> 
> For nightly backups, what you would pick between the two?
> Thanks.

Oh, I'm sorry, I didn't read your message that way.

I have a not-very-busy hobby server of sorts that creates a snapshot every 
day, mounts it and exports it via NFS, so a backup host can pull from it 
if everything keeps working ;-).

When I created the thing I thought that 1GB snapshot space would be 
enough; there should not be many logs and everything worth something is 
sitting on other partitions; so this is only the root volume and the 
/var/log directory so to speak.

To my surprise, the update script regularly emails me that the root 
snapshot was no longer mounted by the time it removed it.

When I log on during the day the snapshot is already half filled. I do 
not know what causes this. I cannot find any logs or anything else that 
would warrant such behaviour. But the best part of it all, is that the 
system never suffers.

The thing is just dismounted apparently; I don't even know what causes 
it.

The other volumes are thin. I am just very afraid of the thing filling 
up due to some runaway process or an error on my part.

If I have a 30GB volume and a 30GB snapshot of that volume, and if this 
volume is nearly empty and something starts filling it up, it will do 
twice the writes to the thin pool. Any damage done is doubled.

The only thing that could save you (me) at this point is a process 
instantly responding to some 90% full message and hoping it'd be in 
time. Of course I don't have this monitoring in place; everything 
requires work.

This is someone having written a script for Nagios:

https://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_lvm/details

Then someone else did the same for NewRelic:

https://discuss.newrelic.com/t/lvm-thin-pool-monitoring/29295/17

My version of LVM indicates only the following:

     # snapshot_library is the library used when monitoring a snapshot device.
     #
     # "libdevmapper-event-lvm2snapshot.so" monitors the filling of
     # snapshots and emits a warning through syslog when the use of
     # the snapshot exceeds 80%. The warning is repeated when 85%, 90% and
     # 95% of the snapshot is filled.

     snapshot_library = "libdevmapper-event-lvm2snapshot.so"

     # thin_library is the library used when monitoring a thin device.
     #
     # "libdevmapper-event-lvm2thin.so" monitors the filling of
     # pool and emits a warning through syslog when the use of
     # the pool exceeds 80%. The warning is repeated when 85%, 90% and
     # 95% of the pool is filled.

     thin_library = "libdevmapper-event-lvm2thin.so"

I'm sorry, I was trying to discover how to use journalctl to check for 
the message and it is just incredibly painful.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 17:36                 ` Xen
@ 2017-04-14 18:59                   ` Gionatan Danti
  2017-04-14 19:20                     ` Xen
                                       ` (2 more replies)
  0 siblings, 3 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-14 18:59 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen

Il 14-04-2017 19:36 Xen ha scritto:
> The thing is just dismounted apparently; I don't even know what causes 
> it.
> 

Maybe running "iotop -a" for some hours can point you to the right 
direction?

> The other volumes are thin. I am just very afraid of the thing filling
> up due to some runaway process or an error on my part.
> 
> If I have a 30GB volume and a 30GB snapshot of that volume, and if
> this volume is nearly empty and something starts filling it up, it
> will do twice the writes to the thin pool. Any damage done is doubled.
> 
> The only thing that could save you (me) at this point is a process
> instantly responding to some 90% full message and hoping it'd be in
> time. Of course I don't have this monitoring in place; everything
> requires work.

There is something similar already in place: when pool utilization is 
over 95%, lvmthin *should* try a (lazy) umount. Give a look here: 
https://www.redhat.com/archives/linux-lvm/2016-May/msg00042.html

Monitoring is a great thing; anyway, a safe fail policy would be *very* 
nice...

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 18:59                   ` Gionatan Danti
@ 2017-04-14 19:20                     ` Xen
  2017-04-15  8:27                       ` Xen
  2017-04-15 21:22                     ` Xen
  2017-04-15 21:48                     ` Xen
  2 siblings, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-14 19:20 UTC (permalink / raw)
  To: linux-lvm

Gionatan Danti schreef op 14-04-2017 20:59:
> Il 14-04-2017 19:36 Xen ha scritto:
>> The thing is just dismounted apparently; I don't even know what causes 
>> it.
>> 
> 
> Maybe running "iotop -a" for some hours can point you to the right 
> direction?
> 
>> The other volumes are thin. I am just very afraid of the thing filling
>> up due to some runaway process or an error on my part.
>> 
>> If I have a 30GB volume and a 30GB snapshot of that volume, and if
>> this volume is nearly empty and something starts filling it up, it
>> will do twice the writes to the thin pool. Any damage done is doubled.
>> 
>> The only thing that could save you (me) at this point is a process
>> instantly responding to some 90% full message and hoping it'd be in
>> time. Of course I don't have this monitoring in place; everything
>> requires work.
> 
> There is something similar already in place: when pool utilization is
> over 95%, lvmthin *should* try a (lazy) umount. Give a look here:
> https://www.redhat.com/archives/linux-lvm/2016-May/msg00042.html

I even forgot about that. I have such a bad memory.

Checking back, the host that I am now on uses LVM 111 (Debian 8). The 
next update is to... 111 ;-).

That was almost a year ago. You were using version 130 back then. I am 
still on 111 on Debian ;-).

Zdenek recommended 142 back then.

I could take it out of testing though. Version 168.


> Monitoring is a great thing; anyway, a safe fail policy would be *very* 
> nice...

A lazy umount does not invalidate any handles held by processes, for 
example a process that has a directory open.

I believe there was an issue with the remount -o ro call? Taking too 
many resources for the daemon?

Anyway I am very happy that the umount happens, if it happens.

I just don't feel comfortable about the system at all. I just don't want 
it to crash :p.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 19:20                     ` Xen
@ 2017-04-15  8:27                       ` Xen
  2017-04-15 23:35                         ` Xen
  2017-04-17 12:33                         ` Xen
  0 siblings, 2 replies; 94+ messages in thread
From: Xen @ 2017-04-15  8:27 UTC (permalink / raw)
  To: linux-lvm

Xen schreef op 14-04-2017 19:36:

> I'm sorry, I was trying to discover how to use journalctl to check for
> the message and it is just incredibly painful.

So this is how you find out the messages of a certain program by 
journalctl:

journalctl SYSLOG_IDENTIFIER=lvm

So user friendly ;-).

Then you need to mimic the behaviour of logtail and write your own 
program for it.

You can save the cursor in journalctl so as to do that.

journalctl SYSLOG_IDENTIFIER=lvm --show-cursor

and then use it with --cursor or --after-cursor, but I have no clue what 
the difference is.
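
In case it is useful, the round-trip I mean is roughly this (untested as 
written here; the file location is just what my script happens to use):

# first run: save how far we have read (sed strips the "-- cursor: "
# prefix from the line that --show-cursor appends at the end)
journalctl SYSLOG_IDENTIFIER=lvm --show-cursor | \
    sed -n 's/^-- cursor: //p' > /run/lvm/lastlog

# later runs: only show messages newer than the saved position
journalctl SYSLOG_IDENTIFIER=lvm --after-cursor="$(cat /run/lvm/lastlog)"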

I created a script to be run as a cron job that will email root in a 
pretty nice message.

http://www.xenhideout.nl/scripts/snapshot-check.sh.txt

I only made it for regular snapshot messages currently though, I have 
not yet seen fit to test the messages that 
libdevmapper-event-lvm2thin.so produces.

But this is the format you can expect:


Snapshot linux/root-snap has been unmounted from /srv/root.

Log message:

Apr 14 21:11:01 perfection lvm[463]: Unmounting invalid snapshot linux-root--snap from /srv/root.

Earlier messages:

Apr 14 19:25:31 perfection lvm[463]: Snapshot linux-root--snap is now 81% full.
Apr 14 19:25:41 perfection lvm[463]: Snapshot linux-root--snap is now 86% full.
Apr 14 21:10:51 perfection lvm[463]: Snapshot linux-root--snap is now 93% full.
Apr 14 21:11:01 perfection lvm[463]: Snapshot linux-root--snap is now 97% full.


I haven't yet fully tested everything but it saves the cursor in 
/run/lvm/lastlog ;-), and will not produce output when there is nothing 
new. It will not produce any output when run as a cron job and it will 
produce status messages when run interactively.

It has options like:
   -c : clear the cursor
   -u : update the cursor and nothing else
   -d : dry-run
   --help

There is not yet an option to send the email to stdout but if you botch 
up the script or emailing fails it will put the email to stderr so cron 
will pick it up.

I guess the next option is to not send an email or to send it to stdout 
instead (also picked up by cron normally).

In any case I assume current LVM already has ways of running scripts, 
but that depends on dmeventd....

So you could have different actions in the script as well and have it 
run other scripts.

Currently the action is simply to email...

Once I have tested thin fillup I will put that in too :p.

Regards.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 18:59                   ` Gionatan Danti
  2017-04-14 19:20                     ` Xen
@ 2017-04-15 21:22                     ` Xen
  2017-04-15 21:49                       ` Xen
  2017-04-15 21:48                     ` Xen
  2 siblings, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-15 21:22 UTC (permalink / raw)
  To: linux-lvm

Gionatan Danti schreef op 14-04-2017 20:59:
> Il 14-04-2017 19:36 Xen ha scritto:
>> The thing is just dismounted apparently; I don't even know what causes 
>> it.
>> 
> 
> Maybe running "iotop -a" for some hours can point you to the right 
> direction?

I actually think it is enough if 225 extents get written. The snapshot 
is 924 MB or 250-25 extents.

I think it only needs to write in 225 different places on the disk (225 
different 4MB sectors) to fill the snapshot up.

Cause there is no way in hell that an actual 924 MB would be written, 
because the entire system is not more than 5GB and the entire systemd 
journal is not more than maybe 28 MB :p.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14 18:59                   ` Gionatan Danti
  2017-04-14 19:20                     ` Xen
  2017-04-15 21:22                     ` Xen
@ 2017-04-15 21:48                     ` Xen
  2017-04-18 10:17                       ` Zdenek Kabelac
  2 siblings, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-15 21:48 UTC (permalink / raw)
  To: linux-lvm

Gionatan Danti schreef op 14-04-2017 20:59:

> There is something similar already in place: when pool utilization is
> over 95%, lvmthin *should* try a (lazy) umount. Give a look here:
> https://www.redhat.com/archives/linux-lvm/2016-May/msg00042.html
> 
> Monitoring is a great thing; anyway, a safe fail policy would be *very* 
> nice...

This is the idea I had back then:

- reserve space for calamities.

- when running out of space, start informing the filesystem(s).

- communicate individual unusable blocks or simply a number of 
unavailable blocks through some inter-layer communication system.

But it was said such channels do not exist or that the concept of a 
block device (a logical addressing space) suddenly having trouble 
delivering the blocks would be a conflicting concept.

If the concept of a filesystem needing to deal with disappearing space 
were made real, what you would get is a hidden block of unusable space 
that starts to grow.

Supposing that you have 3 volumes of sizes X Y and Z.

With the constraint that currently individually each volume is capable 
of using all space it wants,

now volume X starts to use up more space and the available remaining 
space is no longer enough for Z.

The space available to all volumes is equivalent and is only constrained 
by their own virtual sizes.

So saying that for each volume the available space = min( own filesystem 
space, available thin space )

any consumption by any of the other volumes will see a reduction of the 
available space by the same amount for the other volumes.

For the using volume this is to be expected, for the other volumes this 
is strange.

Each consumption turns into a lessening for all the other volumes, 
including its own.

This reduction of space is therefore a single number that pertains to 
all volumes and only comes into effect if the real available space is 
less than the (filesystem-oriented, but rather LVM-determined) virtual 
space the volume thought it had.

For all volumes that are affected, there is now a discrepancy between 
virtual available space and real available space.

This differs per volume but is really just a subtraction. However LVM 
should be able to know about this number since it is just about a number 
of extents available and 'needed'.
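
To put made-up numbers on it, just to illustrate the subtraction: say the 
pool has 100 extents and X, Y and Z each have a virtual size of 60 
extents. If X and Y have allocated 45 extents each, only 10 free extents 
remain, so a still-empty Z really has min(60, 10) = 10 extents available 
- a discrepancy of 50 extents, obtained by simply subtracting the pool's 
free-extent count from the volume's own free virtual space.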

Zdenek said that this information is not available in a live fashion 
because the algorithms that find a new free extent need to go look for 
it first.

Regardless if this information was available it could be communicated to 
the logical volume who could communicate it to the filesystem.

There are two ways: polling a number through some block device command 
or telling the filesystem through a daemon.

Remounting the filesystem read-only is one such "through a daemon" 
command.

Zdenek said that dmevent plugins cannot issue a remount request because 
the system call is too big.

But it would be important for the filesystem to have a feature for dealing 
with unavailable space, for example by forcing it to reserve a certain 
amount of space in a live or dynamic fashion.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-15 21:22                     ` Xen
@ 2017-04-15 21:49                       ` Xen
  0 siblings, 0 replies; 94+ messages in thread
From: Xen @ 2017-04-15 21:49 UTC (permalink / raw)
  To: linux-lvm

Xen schreef op 15-04-2017 23:22:

> I actually think it is enough if 225 extents get written. The snapshot
> is 924 MB or 250-25 extents.

Erm, that's 256 - 25 = 231. My math is good today :p.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-15  8:27                       ` Xen
@ 2017-04-15 23:35                         ` Xen
  2017-04-17 12:33                         ` Xen
  1 sibling, 0 replies; 94+ messages in thread
From: Xen @ 2017-04-15 23:35 UTC (permalink / raw)
  To: linux-lvm

Xen schreef op 15-04-2017 10:27:

> http://www.xenhideout.nl/scripts/snapshot-check.sh.txt

My script now does thin pool reporting, at least for the data volume 
that I could check (tpool).

:p.

It can create messages such as this :p.



Thin is currently at 80%.

Log messages:

Apr 16 00:00:12 perfection lvm[463]: Thin linux-thin-tpool is now 80% full.
Apr 15 18:10:42 perfection lvm[463]: Thin linux-thin-tpool is now 95% full.
Apr 15 18:10:22 perfection lvm[463]: Thin linux-thin-tpool is now 92% full.

Previous messages:

Apr 15 14:38:12 perfection lvm[463]: Thin linux-thin-tpool is now 85% full.
Apr 15 14:37:12 perfection lvm[463]: Thin linux-thin-tpool is now 80% full.


The cursor of journalctl was at the 85% mark; that is why an earlier 
invocation would have shown the last 2 messages, while in this invocation 
the above three would be displayed and found.

(I copied an older cursor file over the cursor location).

So it shows all new messages when there is something new to be shown and 
it uses the occasion to also remind you of older messages.

Still working on something better...

But this is already quite nice.

Basically it sends 3 types of emails:

- snapshot filling up
- snapshot filled up completely
- thin pool filling up

But it only responds to dmevent messages in syslog. Of course you could 
take the opportunity to give much more detailed information which is 
what I am working on but this does require invocations of lvs etc.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-15  8:27                       ` Xen
  2017-04-15 23:35                         ` Xen
@ 2017-04-17 12:33                         ` Xen
  1 sibling, 0 replies; 94+ messages in thread
From: Xen @ 2017-04-17 12:33 UTC (permalink / raw)
  To: linux-lvm

Xen schreef op 15-04-2017 10:27:

> I created a script to be run as a cron job that will email root in a
> pretty nice message.
> 
> http://www.xenhideout.nl/scripts/snapshot-check.sh.txt

Just was so happy. I guess I can still improve the email, but:


Snapshot linux/root-snap has been unmounted from /srv/root because it 
filled up to a 100%.

Log message:

Apr 17 14:08:38 perfection lvm[463]: Unmounting invalid snapshot linux-root--snap from /srv/root.

Earlier messages:

Apr 17 14:08:21 perfection lvm[463]: Snapshot linux-root--snap is now 96% full.
Apr 17 14:08:01 perfection lvm[463]: Snapshot linux-root--snap is now 91% full.
Apr 17 14:07:51 perfection lvm[463]: Snapshot linux-root--snap is now 86% full.
Apr 17 14:07:31 perfection lvm[463]: Snapshot linux-root--snap is now 81% full.
-------------------------------------------------------------------------------
Apr 14 21:11:01 perfection lvm[463]: Snapshot linux-root--snap is now 97% full.
Apr 14 21:10:51 perfection lvm[463]: Snapshot linux-root--snap is now 93% full.
Apr 14 19:25:41 perfection lvm[463]: Snapshot linux-root--snap is now 86% full.
Apr 14 19:25:31 perfection lvm[463]: Snapshot linux-root--snap is now 81% full.


I was just upgrading packages hence the snapshot filled up quickly.


System works well. I don't get instant reports but if something happens 
within the space of 5 minutes it is too late anyway.

Only downside is that thin messages get repeated whenever snapshots are 
(re)created. So lvmetad will output a new message for me every day at 0:00. So 
if thin is > 80%, every day (for me) there is a new message for no 
reason in that sense.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-15 21:48                     ` Xen
@ 2017-04-18 10:17                       ` Zdenek Kabelac
  2017-04-18 13:23                         ` Gionatan Danti
  2017-04-19  7:22                         ` Xen
  0 siblings, 2 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-18 10:17 UTC (permalink / raw)
  To: LVM general discussion and development, Xen

Dne 15.4.2017 v 23:48 Xen napsal(a):
> Gionatan Danti schreef op 14-04-2017 20:59:
> 
>> There is something similar already in place: when pool utilization is
>> over 95%, lvmthin *should* try a (lazy) umount. Give a look here:
>> https://www.redhat.com/archives/linux-lvm/2016-May/msg00042.html
>>
>> Monitoring is a great thing; anyway, a safe fail policy would be *very* nice...
> 
> This is the idea I had back then:
> 
> - reserve space for calamities.
> 
> - when running out of space, start informing the filesystem(s).
> 
> - communicate individual unusable blocks or simply a number of unavailable 
> blocks through some inter-layer communication system.
> 
> But it was said such channels do not exist or that the concept of a block 
> device (a logical addressing space) suddenly having trouble delivering the 
> blocks would be a conflicting concept.
> 
> If the concept of a filesystem needing to deal with disappearing space were to 
> be made live,
> 
> what you would get was.
> 
> that there starts to grow some hidden block of unusable space.
> 
> Supposing that you have 3 volumes of sizes X Y and Z.
> 
> With the constraint that currently individually each volume is capable of 
> using all space it wants,
> 
> now volume X starts to use up more space and the available remaining space is 
> no longer enough for Z.
> 
> The space available to all volumes is equivalent and is only constrained by 
> their own virtual sizes.
> 
> So saying that for each volume the available space = min( own filesystem 
> space, available thin space )
> 
> any consumption by any of the other volumes will see a reduction of the 
> available space by the same amount for the other volumes.
> 
> For the using volume this is to be expected, for the other volumes this is 
> strange.
> 
> each consumption turns into a lessening for all the other volumes including 
> the own.
> 
> this reduction of space is therefore a single number that pertains to all 
> volumes and only comes to be in any kind of effect if the real available space 
> is less than the (filesystem oriented, but rather LVM determined) virtual 
> space the volume thought it had.
> 
> for all volumes that are affected, there is now a discrepancy between virtual 
> available space and real available space.
> 
> this differs per volume but is really just a subtraction. However LVM should 
> be able to know about this number since it is just about a number of extents 
> available and 'needed'.
> 
> Zdenek said that this information is not available in a live fashion because 
> the algorithms that find a new free extent need to go look for it first.

Already got lost in lots of posts.

But there is a tool, 'thin_ls', which can be used for detailed info about used 
space by every single thin volume.

It's not supported directly by the 'lvm2' command (so not yet presented in a shiny 
cool way via 'lvs -a') - but the user can relatively easily run this command
on his own on a live pool.


See the usage of

dmsetup message /dev/mapper/pool 0
     [ reserve_metadata_snap | release_metadata_snap ]

and 'man thin_ls'.

Just don't forget to release the snapshot of the thin-pool kernel metadata 
once it's no longer needed...
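
Roughly, the sequence would be something like this (device names are only 
an example of how lvm exposes the pool's -tpool and _tmeta devices; check 
'man thin_ls' for the exact options and fields):

# grab a consistent, read-only snapshot of the pool's kernel metadata
dmsetup message /dev/mapper/vg-pool-tpool 0 reserve_metadata_snap

# report per-thin-volume usage from the pool's metadata device
thin_ls --metadata-snap /dev/mapper/vg-pool_tmeta

# release the metadata snapshot again once done
dmsetup message /dev/mapper/vg-pool-tpool 0 release_metadata_snap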

> There are two ways: polling a number through some block device command or 
> telling the filesystem through a daemon.
> 
> Remounting the filesystem read-only is one such "through a daemon" command.
> 

Unmount of thin-pool has been dropped from upstream versions >169.
It's now delegated to a user script executed at % checkpoints
(see 'man dmeventd').

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-18 10:17                       ` Zdenek Kabelac
@ 2017-04-18 13:23                         ` Gionatan Danti
  2017-04-18 14:32                           ` Stuart D. Gathman
  2017-04-19  7:22                         ` Xen
  1 sibling, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-18 13:23 UTC (permalink / raw)
  To: LVM general discussion and development

On 18/04/2017 12:17, Zdenek Kabelac wrote:
> Unmount of thin-pool has been dropped from upstream version >169.
> It's now delegated to user script executed on % checkpoints
> (see 'man dmeventd')

Hi Zdenek,
I missed that; thanks.

Any thoughts on the original question? For snapshots with a relatively big 
CoW table, from a stability standpoint, how do you feel about classical 
vs thin-pool snapshots?

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-18 13:23                         ` Gionatan Danti
@ 2017-04-18 14:32                           ` Stuart D. Gathman
  0 siblings, 0 replies; 94+ messages in thread
From: Stuart D. Gathman @ 2017-04-18 14:32 UTC (permalink / raw)
  To: LVM general discussion and development

On Tue, 18 Apr 2017, Gionatan Danti wrote:

> Any thoughts on the original question? For snapshot with relatively big CoW 
> table, from a stability standpoint, how do you feel about classical vs 
> thin-pool snapshot?

Classic snapshots are rock solid.  There is no risk to the origin
volume.  If the snapshot CoW fills up, all reads and all writes to the
*snapshot* return IOError.  The origin is unaffected.

If a classic snapshot exists across a reboot, then the entire CoW table
(but not the data chunks) must be loaded into memory when the snapshot 
(or origin) is activated.  This can greatly delay boot for a large CoW.

For the common purpose of temporary snapshots for consistent backups,
this is not an issue.

-- 
 	      Stuart D. Gathman <stuart@gathman.org>
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-18 10:17                       ` Zdenek Kabelac
  2017-04-18 13:23                         ` Gionatan Danti
@ 2017-04-19  7:22                         ` Xen
  1 sibling, 0 replies; 94+ messages in thread
From: Xen @ 2017-04-19  7:22 UTC (permalink / raw)
  To: linux-lvm

Zdenek Kabelac schreef op 18-04-2017 12:17:

> Already got lost in lots of posts.
> 
> But  there is tool  'thin_ls'  which can be used for detailed info
> about used space by every single  thin volume.
> 
> It's not supported directly by 'lvm2' command (so not yet presented in
> shiny cool way via 'lvs -a') - but user can relatively easily run this
> command
> on his own on life pool.
> 
> 
> See for usage of
> 
> 
> dmsetup message /dev/mapper/pool 0
>     [ reserve_metadata_snap | release_metadata_snap ]
> 
> and 'man thin_ls'
> 
> 
> Just don't forget to release snapshot of thin-pool kernel metadata
> once it's not needed...
> 
>> There are two ways: polling a number through some block device command 
>> or telling the filesystem through a daemon.
>> 
>> Remounting the filesystem read-only is one such "through a daemon" 
>> command.
>> 
> 
> Unmount of thin-pool has been dropped from upstream version >169.
> It's now delegated to user script executed on % checkpoints
> (see 'man dmeventd')

So I wrote something useless again ;-).

Always this issue with versions...

So let's see, Debian Unstable (Sid) still has version 168, as does 
Testing (Stretch).
Ubuntu Zesty Zapus (17.04) has 167.

So for the foreseeable future both those distributions won't have that 
feature at least.

I heard you speak of those scripts, yes, but I did not know when or what 
yet, thanks.

I guess my script could be run directly from the script execution in the 
future then.
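
From what I can tell from the man pages of the newer releases (so take 
this as a guess from someone still on 111, not as gospel), it would look 
something like this: in lvm.conf, dmeventd section,

     thin_command = "/usr/local/sbin/thin-handler.sh"

and then a handler along the lines of

#!/bin/sh
# dmeventd is said to export the pool fill levels to the command it runs
echo "thin pool data ${DMEVENTD_THIN_POOL_DATA}% / metadata ${DMEVENTD_THIN_POOL_METADATA}% full" \
    | mail -s "thin pool warning on $(hostname)" root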

Thanks for responding though, much obliged.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-14  8:24       ` Zdenek Kabelac
  2017-04-14  9:07         ` Gionatan Danti
@ 2017-04-22  7:14         ` Gionatan Danti
  2017-04-22 16:32           ` Xen
  2017-04-22 21:22           ` Zdenek Kabelac
  1 sibling, 2 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-22  7:14 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: Xen, LVM general discussion and development

Il 14-04-2017 10:24 Zdenek Kabelac ha scritto:
> However there are many different solutions for different problems -
> and with current script execution - user may build his own solution -
> i.e.  call
> 'dmsetup remove -f' for running thin volumes - so all instances get
> 'error' device   when pool is above some threshold setting (just like
> old 'snapshot' invalidation worked) - this way user will just kill
> thin volume user task, but will still keep thin-pool usable for easy
> maintenance.
> 

This is a very good idea - I tried it and it indeed works.
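
For the record, it boils down to something like this (the LV name is just 
from my test setup):

# load an 'error' target in place of the open thin LV's mapping, so any
# further I/O to it fails immediately while the pool itself stays usable
dmsetup remove --force /dev/mapper/vg_kvm-thinvol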

However, it is not very clear to me what is the best method to monitor 
the allocated space and trigger an appropriate user script (I understand 
that version > .169 has %checkpoint scripts, but current RHEL 7.3 is on 
.166).

I had the following ideas:
1) monitor the syslog for the "WARNING pool is dd.dd% full" message;
2) set a higher-than-0 low_water_mark and catch the dmesg/syslog 
"out-of-data" message;
3) register with device mapper to be notified.

What do you think is the better approach? If trying to register with 
device mapper, how can I accomplish that?
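
As a crude fallback I can always poll the kernel's own counters, something 
like the sketch below (untested as written; the device name is from my 
test box, and field 6 of the thin-pool status line should be the
"<used>/<total> data blocks" pair, per the kernel thin-provisioning docs):

#!/bin/sh
POOL=vg_kvm-thinpool-tpool
PCT=$(dmsetup status "$POOL" | awk '{ split($6, a, "/"); printf "%d", a[1] * 100 / a[2] }')
[ "$PCT" -ge 90 ] && echo "thin pool $POOL is ${PCT}% full" | mail -s "thin pool warning" root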

One more thing: from the device-mapper docs (and indeed as observed in my 
tests), the "pool is dd.dd% full" message is raised one single time: once 
a message has been raised, even if the pool is emptied and refilled, no 
new messages are generated. The only method I found to let the system 
re-generate the message is to deactivate and reactivate the thin pool 
itself.

Is this the correct method, or do easier/better ones exist?

> But then there is number of cases ending with the case that
> you run out of metadata space that has the maximal size of
> ~16G so you can't even extend it, simply because it's
> unsupported to use any bigger size

Just out of curiosity, in such a case, how does one proceed further to regain 
access to the data?

And now the most burning question ... ;)
Given that the thin-pool is monitored and never allowed to fill its 
data/metadata space, how do you consider its overall stability vs 
classical thick LVM?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-22  7:14         ` Gionatan Danti
@ 2017-04-22 16:32           ` Xen
  2017-04-22 20:58             ` Gionatan Danti
  2017-04-22 21:17             ` Zdenek Kabelac
  2017-04-22 21:22           ` Zdenek Kabelac
  1 sibling, 2 replies; 94+ messages in thread
From: Xen @ 2017-04-22 16:32 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM, development, Zdenek Kabelac

Gionatan Danti schreef op 22-04-2017 9:14:
> Il 14-04-2017 10:24 Zdenek Kabelac ha scritto:
>> However there are many different solutions for different problems -
>> and with current script execution - user may build his own solution -
>> i.e.  call
>> 'dmsetup remove -f' for running thin volumes - so all instances get
>> 'error' device   when pool is above some threshold setting (just like
>> old 'snapshot' invalidation worked) - this way user will just kill
>> thin volume user task, but will still keep thin-pool usable for easy
>> maintenance.
>> 
> 
> This is a very good idea - I tried it and it indeed works.

So a user script can execute dmsetup remove -f on the thin pool?

Oh no, for all volumes.

That is awesome, that means an errors=remount-ro mount will cause a 
remount, right?

> However, it is not very clear to me what is the best method to monitor
> the allocated space and trigger an appropriate user script (I
> understand that version > .169 has %checkpoint scripts, but current
> RHEL 7.3 is on .166).
> 
> I had the following ideas:
> 1) monitor the syslog for the "WARNING pool is dd.dd% full" message;

This is what my script is doing of course. It is a bit ugly and a bit 
messy by now, but I could still clean it up :p.

However it does not follow syslog, but checks periodically. You can also 
follow with -f.

It does not allow for user specified actions yet.

In that case it would fulfill the same purpose as > 169, only a bit more 
poorly.

> One more thing: from device-mapper docs (and indeed as observed in my
> tests), the "pool is dd.dd% full" message is raised one single time:
> if a message is raised, the pool is emptied and refilled, no new
> messages are generated. The only method I found to let the system
> re-generate the message is to deactivate and reactivate the thin pool
> itself.

This is not my experience on LVM 111 from Debian.

For me new messages are generated when:

- the pool reaches any threshold again
- I remove and recreate any thin volume.

Because my system regenerates snapshots, I now get an email from my 
script when the pool is > 80%, every day.

So if I keep the pool above 80%, every day at 0:00 I get an email about 
it :p. Because syslog gets a new entry for it. This is why I know :p.

> And now the most burning question ... ;)
> Given that thin-pool is under monitor and never allowed to fill
> data/metadata space, as do you consider its overall stability vs
> classical thick LVM?
> 
> Thanks.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-22 16:32           ` Xen
@ 2017-04-22 20:58             ` Gionatan Danti
  2017-04-22 21:17             ` Zdenek Kabelac
  1 sibling, 0 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-22 20:58 UTC (permalink / raw)
  To: Xen; +Cc: LVM, development, Zdenek Kabelac

Il 22-04-2017 18:32 Xen ha scritto:
> This is not my experience on LVM 111 from Debian.
> 
> For me new messages are generated when:
> 
> - the pool reaches any threshold again
> - I remove and recreate any thin volume.
> 
> Because my system regenerates snapshots, I now get an email from my
> script when the pool is > 80%, every day.
> 
> So if I keep the pool above 80%, every day at 0:00 I get an email
> about it :p. Because syslog gets a new entry for it. This is why I
> know :p.
> 

Interesting, I had to try that ;)
Thanks for suggesting.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-22 16:32           ` Xen
  2017-04-22 20:58             ` Gionatan Danti
@ 2017-04-22 21:17             ` Zdenek Kabelac
  2017-04-23  5:29               ` Xen
  1 sibling, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-22 21:17 UTC (permalink / raw)
  To: Xen; +Cc: LVM general discussion and development

Dne 22.4.2017 v 18:32 Xen napsal(a):
> Gionatan Danti schreef op 22-04-2017 9:14:
>> Il 14-04-2017 10:24 Zdenek Kabelac ha scritto:
>>> However there are many different solutions for different problems -
>>> and with current script execution - user may build his own solution -
>>> i.e.  call
>>> 'dmsetup remove -f' for running thin volumes - so all instances get
>>> 'error' device   when pool is above some threshold setting (just like
>>> old 'snapshot' invalidation worked) - this way user will just kill
>>> thin volume user task, but will still keep thin-pool usable for easy
>>> maintenance.
>>>
>>
>> This is a very good idea - I tried it and it indeed works.
> 
> So a user script can execute dmsetup remove -f on the thin pool?
> 
> Oh no, for all volumes.
> 
> That is awesome, that means a errors=remount-ro mount will cause a remount right?

Well 'remount-ro' will fail but you will not be able to read anything
from the volume either.

So as said - many users, many different solutions are needed...

Currently lvm2 can't support that much variety and complexity...


> 
>> However, it is not very clear to me what is the best method to monitor
>> the allocated space and trigger an appropriate user script (I
>> understand that version > .169 has %checkpoint scripts, but current
>> RHEL 7.3 is on .166).
>>
>> I had the following ideas:
>> 1) monitor the syslog for the "WARNING pool is dd.dd% full" message;
> 
> This is what my script is doing of course. It is a bit ugly and a bit messy by 
> now, but I could still clean it up :p.
> 
> However it does not follow syslog, but checks periodically. You can also 
> follow with -f.
> 
> It does not allow for user specified actions yet.
> 
> In that case it would fulfill the same purpose as > 169 only a bit more poorly.
> 
>> One more thing: from device-mapper docs (and indeed as observed in my
>> tests), the "pool is dd.dd% full" message is raised one single time:
>> if a message is raised, the pool is emptied and refilled, no new
>> messages are generated. The only method I found to let the system
>> re-generate the message is to deactivate and reactivate the thin pool
>> itself.
> 
> This is not my experience on LVM 111 from Debian.
> 
> For me new messages are generated when:
> 
> - the pool reaches any threshold again
> - I remove and recreate any thin volume.
> 
> Because my system regenerates snapshots, I now get an email from my script 
> when the pool is > 80%, every day.
> 
> So if I keep the pool above 80%, every day at 0:00 I get an email about it :p. 
> Because syslog gets a new entry for it. This is why I know :p.

The explanation here is simple - when you create a new thinLV there is 
currently a full suspend - before the 'suspend' the pool is 'unmonitored', 
after the resume it is monitored again - and you get your warning logged again.


Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-22  7:14         ` Gionatan Danti
  2017-04-22 16:32           ` Xen
@ 2017-04-22 21:22           ` Zdenek Kabelac
  2017-04-24 13:49             ` Gionatan Danti
  1 sibling, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-22 21:22 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development

Dne 22.4.2017 v 09:14 Gionatan Danti napsal(a):
> Il 14-04-2017 10:24 Zdenek Kabelac ha scritto:
>> However there are many different solutions for different problems -
>> and with current script execution - user may build his own solution -
>> i.e.  call
>> 'dmsetup remove -f' for running thin volumes - so all instances get
>> 'error' device   when pool is above some threshold setting (just like
>> old 'snapshot' invalidation worked) - this way user will just kill
>> thin volume user task, but will still keep thin-pool usable for easy
>> maintenance.
>>
> 
> This is a very good idea - I tried it and it indeed works.
> 
> However, it is not very clear to me what is the best method to monitor the 
> allocated space and trigger an appropriate user script (I understand that 
> version > .169 has %checkpoint scripts, but current RHEL 7.3 is on .166).
> 
> I had the following ideas:
> 1) monitor the syslog for the "WARNING pool is dd.dd% full" message;
> 2) set a higher than 0  low_water_mark and cache the dmesg/syslog 
> "out-of-data" message;
> 3) register with device mapper to be notified.
> 
> What do you think is the better approach? If trying to register with device 
> mapper, how can I accomplish that?
> 
> One more thing: from device-mapper docs (and indeed as observed in my tests), 
> the "pool is dd.dd% full" message is raised one single time: if a message is 
> raised, the pool is emptied and refilled, no new messages are generated. The 
> only method I found to let the system re-generate the message is to 
> deactivate and reactivate the thin pool itself.


ATM there is even a bug for 169 & 170 - dmeventd should generate a message
at 80, 85, 90, 95 and 100 - but it does it only once - will be fixed soon...

>> ~16G so you can't even extend it, simply because it's
>> unsupported to use any bigger size
> 
> Just out of curiosity, in such a case, how to proceed further to regain access 
> to data?
> 
> And now the most burning question ... ;)
> Given that thin-pool is under monitor and never allowed to fill data/metadata 
> space, as do you consider its overall stability vs classical thick LVM?

Not seen a metadata error for quite a long time...
Since all the updates are CRC32-protected it's quite solid.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-22 21:17             ` Zdenek Kabelac
@ 2017-04-23  5:29               ` Xen
  2017-04-23  9:26                 ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-23  5:29 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

Zdenek Kabelac schreef op 22-04-2017 23:17:

>> That is awesome, that means a errors=remount-ro mount will cause a 
>> remount right?
> 
> Well 'remount-ro' will fail but you will not be able to read anything
> from volume as well.

Well that is still preferable to anything else.

It is preferable to a system crash, I mean.

So if there is no other last resort, I think this is really the only 
last resort that exists?

Or maybe one of the other things Gionatan suggested.

> Currently lvm2 can't support that much variety and complexity...

I think it's simpler but okay, sure...

I think pretty much anyone would prefer a volume-read-errors system 
rather than a kernel-hang system.

It is just not of the same magnitude of disaster :p.

> The explanation here is simple - when you create a new thinLV - there
> is currently full suspend - and before 'suspend' pool is 'unmonitored'
> after resume again monitored - and you get your warning logged again.

Right, yes, that's what syslog says.

It does make it a bit annoying to be watching for messages, but I guess 
it means filtering for the monitoring messages too, if you want to filter 
out the recurring message, or checking current thin pool usage before you 
send anything.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-23  5:29               ` Xen
@ 2017-04-23  9:26                 ` Zdenek Kabelac
  2017-04-24 21:02                   ` Xen
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-23  9:26 UTC (permalink / raw)
  To: Xen; +Cc: LVM general discussion and development

Dne 23.4.2017 v 07:29 Xen napsal(a):
> Zdenek Kabelac schreef op 22-04-2017 23:17:
> 
>>> That is awesome, that means a errors=remount-ro mount will cause a remount 
>>> right?
>>
>> Well 'remount-ro' will fail but you will not be able to read anything
>> from volume as well.
> 
> Well that is still preferable to anything else.
> 
> It is preferable to a system crash, I mean.
> 
> So if there is no other last resort, I think this is really the only last 
> resort that exists?
> 
> Or maybe one of the other things Gionatan suggested.
> 
>> Currently lvm2 can't support that much variety and complexity...
> 
> I think it's simpler but okay, sure...
> 
> I think pretty much anyone would prefer a volume-read-errors system rather 
> than a kernel-hang system.

I'm just curious - what do you think will happen when you have
root_LV as a thin LV and the thin pool runs out of space - so 'root_LV'
is replaced with an 'error' target.

How do you think this will be ANY different from hanging your system?


> It is just not of the same magnitude of disaster :p.

IMHO a reboot is still quite a fair solution in such a case.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-22 21:22           ` Zdenek Kabelac
@ 2017-04-24 13:49             ` Gionatan Danti
  2017-04-24 14:48               ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-24 13:49 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development



On 22/04/2017 23:22, Zdenek Kabelac wrote:
> ATM there is even bug for 169 & 170 - dmeventd should generate message
> at 80,85,90,95,100 - but it does it only once - will be fixed soon...

Mmm... quite a bug, considering how important monitoring is. All things 
considered, what do you feel is the better approach to monitor? Is it 
possible to register for dmevents?

> Not seen metadata error for quite long time...
> Since all the updates are CRC32 protected it's quite solid.

Great! Are the metadata writes somehow journaled, or are they written in place?

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-24 13:49             ` Gionatan Danti
@ 2017-04-24 14:48               ` Zdenek Kabelac
  0 siblings, 0 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-24 14:48 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development

Dne 24.4.2017 v 15:49 Gionatan Danti napsal(a):
> 
> 
> On 22/04/2017 23:22, Zdenek Kabelac wrote:
>> ATM there is even bug for 169 & 170 - dmeventd should generate message
>> at 80,85,90,95,100 - but it does it only once - will be fixed soon...
> 
> Mmm... quite a bug, considering how important is monitoring. All things 
> considered, what do you feel is the better approach to monitor? It is 
> possible to register for dmevents?

Not all that big a one - you always get 1 WARNING.
And releases 169 & 170 are clearly marked as developer releases - so they are 
meant for testing and discovering these bugs...

>> Not seen metadata error for quite long time...
>> Since all the updates are CRC32 protected it's quite solid.
> 
> Great! Are the metadata writes somehow journaled or are written in-place?

Surely there is a journal


Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-23  9:26                 ` Zdenek Kabelac
@ 2017-04-24 21:02                   ` Xen
  2017-04-24 21:59                     ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Xen @ 2017-04-24 21:02 UTC (permalink / raw)
  To: linux-lvm

Zdenek Kabelac schreef op 23-04-2017 11:26:

> I'm just curious - what do you think will happen when you have
> root_LV as thin LV and thin pool runs out of space - so 'root_LV'
> is replaced with 'error' target.

Why do you suppose Root LV is on thin?

Why not just stick to the common scenario when thin is used for extra 
volumes or data?

I mean to say that you are raising an exceptional situation as an 
argument against something that I would consider quite common, which 
doesn't quite work that way: you can't prove that most people would not 
want something by raising something most people wouldn't use.

I mean to say let's just look at the most common denominator here.

Root LV on thin is not that.

I have tried it, yes. It gives trouble with Grub, requires the thin package 
to be installed on all systems, and makes it harder to install a system 
too.

Thin root LV is not the idea for most people.

So again, don't you think having data volumes produce errors is 
preferable to having the entire system hang?

> How do you think this will be ANY different from hanging your system ?

Doesn't happen cause you're not using that.

You're smarter than that.

So it doesn't happen and it's not a use case here.

> IMHO reboot is still quite fair solution in such case.

That's irrelevant; if the thin pool is full you need to mitigate it, 
rebooting won't help with that.

And if your root is on thin, rebooting won't do you much good either. So 
you had best keep a running system in which you can mitigate it live 
instead of rebooting to no avail.

That's just my opinion and a lot more commonsensical than what you just 
said, I think.

But to each his own.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-24 21:02                   ` Xen
@ 2017-04-24 21:59                     ` Zdenek Kabelac
  2017-04-26  7:26                       ` Gionatan Danti
  2018-02-27 18:39                       ` Xen
  0 siblings, 2 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-24 21:59 UTC (permalink / raw)
  To: LVM general discussion and development, Xen

Dne 24.4.2017 v 23:02 Xen napsal(a):
> Zdenek Kabelac schreef op 23-04-2017 11:26:
> 
>> I'm just curious - what do you think will happen when you have
>> root_LV as thin LV and thin pool runs out of space - so 'root_LV'
>> is replaced with 'error' target.
> 
> Why do you suppose Root LV is on thin?
> 
> Why not just stick to the common scenario when thin is used for extra volumes 
> or data?
> 
> I mean to say that you are raising an exceptional situation as an argument 
> against something that I would consider quite common, which doesn't quite work 
> that way: you can't prove that most people would not want something by raising 
> something most people wouldn't use.
> 
> I mean to say let's just look at the most common denominator here.
> 
> Root LV on thin is not that.

Well then you might be surprised - there are users using exactly this.

When you have rootLV on thinLV - you could easily snapshot it before doing any 
upgrade and revert back in case something fails on upgrade.
See also projects like snapper...

> 
> I have tried it, yes. Gives troubles with Grub and requires thin package to be 
> installed on all systems and makes it harder to install a system too.

lvm2 is cooking some better boot support atm....



> Thin root LV is not the idea for most people.
> 
> So again, don't you think having data volumes produce errors is not preferable 
> to having the entire system hang?

Not sure why you insist the system hangs.

If the system hangs - and you have a recent kernel & lvm2 - you should file a bug.

If you set  '--errorwhenfull y'  - it should instantly fail.

There should not be any hanging..
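
For example (names are just placeholders):

# make an existing pool error out immediately instead of queueing I/O
# for the default 60-second no_space_timeout when it runs out of space
lvchange --errorwhenfull y vg/thinpool

# or set it when the pool is created
lvcreate -L 10G -T vg/thinpool --errorwhenfull y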


> That's irrelevant; if the thin pool is full you need to mitigate it, rebooting 
> won't help with that.

Well, it's really the admin's task to solve the problem after the panic call
(adding new space).

Thin users can't expect to overload the system in a crazy way and expect the 
system will easily do something magical to restore all the data.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-24 21:59                     ` Zdenek Kabelac
@ 2017-04-26  7:26                       ` Gionatan Danti
  2017-04-26  7:42                         ` Zdenek Kabelac
  2018-02-27 18:39                       ` Xen
  1 sibling, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-26  7:26 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen, zkabelac

Il 24-04-2017 23:59 Zdenek Kabelac ha scritto:
> If you set  '--errorwhenfull y'  - it should instantly fail.

It's my understanding that "--errorwhenfull y" will instantly fail 
writes which imply new allocation requests, but writes to 
already-allocated space will be completed.

Is it possible, without messing directly with device mapper (via 
dmsetup), to configure a strict "read-only" policy, where *all* writes 
(to both allocated and unallocated space) will fail?

If it is not possible to do via the lvm tools, what device-mapper target 
should be used, and how?
Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26  7:26                       ` Gionatan Danti
@ 2017-04-26  7:42                         ` Zdenek Kabelac
  2017-04-26  8:10                           ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-26  7:42 UTC (permalink / raw)
  To: Gionatan Danti, LVM general discussion and development

Dne 26.4.2017 v 09:26 Gionatan Danti napsal(a):
> Il 24-04-2017 23:59 Zdenek Kabelac ha scritto:
>> If you set  '--errorwhenfull y'  - it should instantly fail.
> 
> It's my understanding that "--errorwhenfull y" will instantly fail writes 
> which imply new allocation requests, but writes to already-allocated space 
> will be completed.

yes you understand it properly.

> 
> It is possible, without messing directly with device mapper (via dmsetup), to 
> configure a strict "read-only" policy, where *all* writes (both to allocated 
> or not allocated space) will fail?

Nope it's not.

> 
> It is not possible to do via lvm tools, what/how device-mapper target should 
> be used?

At this moment it's not possible.
I do have some plans/ideas on how to work around this in user-space but it's 
non-trivial - especially on the recovery path.

It would be possible to 'reroute' thin to dm-delay and then point the write 
path to error while leaving the read path as is - but it's adding many new 
states to handle, so ATM it's in the queue...

Using 'ext4' with remount-ro is fairly easy to set up and gets you exactly this
logic.
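
For example (device name is just a placeholder):

# mount the thin LV with the error policy...
mount -o errors=remount-ro /dev/vg/thinvol /mnt/data

# ...or make it the filesystem's default error behaviour
tune2fs -e remount-ro /dev/vg/thinvol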

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26  7:42                         ` Zdenek Kabelac
@ 2017-04-26  8:10                           ` Gionatan Danti
  2017-04-26 11:23                             ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-26  8:10 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

Il 26-04-2017 09:42 Zdenek Kabelac ha scritto:
> At this moment it's not possible.
> I do have some plans/idea how to workaround this in user-space but
> it's non-trivial - especially on recovery path.
> 
> It would be possible to 'reroute' thin to dm-delay and then write path
> to error and read path leave as is - but it's adding many new states
> to handle,
> to ATM it's in queue...

Good to know. Thank you.

> Using 'ext4' with remount-ro is fairly easy to setup and get exactly 
> this
> logic.

I'm not sure this is sufficient. In my testing, ext4 will *not* 
remount-ro on any error, rather only on erroneous metadata updates. For 
example, on a thinpool with "--errorwhenfull y", trying to overcommit 
data with a simple "dd if=/dev/zero of=/mnt/thinvol bs=1M count=1024 
oflag=sync" will cause I/O errors (as shown by dmesg), but the 
filesystem is *not* immediately remounted read-only. Rather, after some 
time, a failed journal update will remount it read-only.

XFS should behave similarly, with the exception that it will shutdown 
the entire filesystem (ie: not even reads are allowed) when metadata 
errors are detected (see note n.1).

The problem is that, as the filesystem often writes its own metadata to 
already-allocated disk space, the out-of-space condition (and the related 
filesystem shutdown) will take some time to be recognized.

Note n.1
From RED HAT STORAGE ADMINISTRATION GUIDE 
(https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Storage_Administration_Guide/ch06s09.html#idp17392328):

Metadata error behavior
The ext3/4 file system has configurable behavior when metadata errors 
are encountered, with the default being to simply continue. When XFS 
encounters a metadata error that is not recoverable it will shut down 
the file system and return a EFSCORRUPTED error. The system logs will 
contain details of the error encountered and will recommend running 
xfs_repair if necessary.


-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26  8:10                           ` Gionatan Danti
@ 2017-04-26 11:23                             ` Zdenek Kabelac
  2017-04-26 13:37                               ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-26 11:23 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development

Dne 26.4.2017 v 10:10 Gionatan Danti napsal(a):
> 
> I'm not sure this is sufficient. In my testing, ext4 will *not* remount-ro on 
> any error, rather only on erroneous metadata updates. For example, on a 
> thinpool with "--errorwhenfull y", trying to overcommit data with a simple "dd 
> if=/dev/zero of=/mnt/thinvol bs=1M count=1024 oflag=sync" will cause I/O 
> errors (as shown by dmesg), but the filesystem is *not* immediately remounted 
> read-only. Rather, after some time, a failed journal update will remount it 
> read-only.

You need to use 'direct' write mode - otherwise you are just witnessing issues
related to 'page-cache' flushing.

Every update of a file means an update of the journal - so you surely can lose
some data in-flight - but every good piece of software needs to flush before
doing the next transaction - so with correctly working transactional software
no data can be lost.

> 
> XFS should behave similarly, with the exception that it will shutdown the 
> entire filesystem (ie: not even reads are allowed) when metadata errors are 
> detected (see note n.1).

Yep - XFS is slightly different - but it is getting improved; however, some new
features are not enabled by default and the user needs to enable them.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26 11:23                             ` Zdenek Kabelac
@ 2017-04-26 13:37                               ` Gionatan Danti
  2017-04-26 14:33                                 ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-04-26 13:37 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development


On 26/04/2017 13:23, Zdenek Kabelac wrote:
>
> You need to use 'direct' write more - otherwise you are just witnessing
> issues related with 'page-cache' flushing.
>
> Every update of file means update of journal - so you surely can lose
> some data in-flight - but every good software needs to the flush before
> doing next transaction - so with correctly working transaction software
> no data could be lost.

I used "oflag=sync" for this very reason - to avoid async writes, 
However, let's retry with "oflat=direct,sync".

This is the thinpool before filling:

[root@blackhole mnt]# lvs
   LV       VG        Attr       LSize  Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
   thinpool vg_kvm    twi-aot---  1.00g                 87.66  12.01
   thinvol  vg_kvm    Vwi-aot---  2.00g thinpool        43.83
   root     vg_system -wi-ao---- 50.00g
   swap     vg_system -wi-ao----  7.62g

[root@blackhole storage]# mount | grep thinvol
/dev/mapper/vg_kvm-thinvol on /mnt/storage type ext4 
(rw,relatime,seclabel,errors=remount-ro,stripe=32,data=ordered)


Fill the thin volume (note that errors are raised immediately due to 
--errorwhenfull=y):

[root@blackhole mnt]# dd if=/dev/zero of=/mnt/storage/test.2 bs=1M 
count=300 oflag=direct,sync
dd: error writing ‘/mnt/storage/test.2’: Input/output error
127+0 records in
126+0 records out
132120576 bytes (132 MB) copied, 14.2165 s, 9.3 MB/s

 From syslog:

Apr 26 15:26:24 localhost lvm[897]: WARNING: Thin pool 
vg_kvm-thinpool-tpool data is now 96.84% full.
Apr 26 15:26:27 localhost kernel: device-mapper: thin: 253:4: reached 
low water mark for data device: sending event.
Apr 26 15:26:27 localhost kernel: device-mapper: thin: 253:4: switching 
pool to out-of-data-space (error IO) mode
Apr 26 15:26:34 localhost lvm[897]: WARNING: Thin pool 
vg_kvm-thinpool-tpool data is now 100.00% full.

Despite write errors, the filesystem is not in read-only mode:

[root@blackhole mnt]#  touch /mnt/storage/test.txt; sync; ls -al 
/mnt/storage
total 948248
drwxr-xr-x. 3 root root      4096 26 apr 15.27 .
drwxr-xr-x. 6 root root        51 20 apr 15.23 ..
drwx------. 2 root root     16384 26 apr 15.24 lost+found
-rw-r--r--. 1 root root 838860800 26 apr 15.25 test.1
-rw-r--r--. 1 root root 132120576 26 apr 15.26 test.2
-rw-r--r--. 1 root root         0 26 apr 15.27 test.txt

I can even recover free space via fstrim:

[root@blackhole mnt]# rm /mnt/storage/test.1; sync
rm: remove regular file ‘/mnt/storage/test.1’? y
[root@blackhole mnt]# fstrim -v /mnt/storage/
/mnt/storage/: 828 MiB (868204544 bytes) trimmed
[root@blackhole mnt]# lvs
   LV       VG        Attr       LSize  Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
   thinpool vg_kvm    twi-aot---  1.00g                 21.83  3.71
   thinvol  vg_kvm    Vwi-aot---  2.00g thinpool        10.92
   root     vg_system -wi-ao---- 50.00g
   swap     vg_system -wi-ao----  7.62g

 From syslog:
Apr 26 15:34:15 localhost kernel: device-mapper: thin: 253:4: switching 
pool to write mode

To me, it seems that the metadata updates completed because they hit
already-allocated disk space, thus not triggering the remount-ro code. Am I
missing something?

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26 13:37                               ` Gionatan Danti
@ 2017-04-26 14:33                                 ` Zdenek Kabelac
  2017-04-26 16:37                                   ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-04-26 14:33 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development

On 26.4.2017 at 15:37, Gionatan Danti wrote:
> 
> On 26/04/2017 13:23, Zdenek Kabelac wrote:
>>
>> You need to use 'direct' write more - otherwise you are just witnessing
>> issues related with 'page-cache' flushing.
>>
>> Every update of file means update of journal - so you surely can lose
>> some data in-flight - but every good software needs to the flush before
>> doing next transaction - so with correctly working transaction software
>> no data could be lost.
> 
> I used "oflag=sync" for this very reason - to avoid async writes, However, 
> let's retry with "oflat=direct,sync".
> 
> This is the thinpool before filling:
> 
> [root@blackhole mnt]# lvs
>    LV       VG        Attr       LSize  Pool     Origin Data%  Meta% Move Log 
> Cpy%Sync Convert
>    thinpool vg_kvm    twi-aot---  1.00g                 87.66  12.01
>    thinvol  vg_kvm    Vwi-aot---  2.00g thinpool        43.83
>    root     vg_system -wi-ao---- 50.00g
>    swap     vg_system -wi-ao----  7.62g
> 
> [root@blackhole storage]# mount | grep thinvol
> /dev/mapper/vg_kvm-thinvol on /mnt/storage type ext4 
> (rw,relatime,seclabel,errors=remount-ro,stripe=32,data=ordered)
> 
> 
> Fill the thin volume (note that errors are raised immediately due to 
> --errorwhenfull=y):
> 
> [root@blackhole mnt]# dd if=/dev/zero of=/mnt/storage/test.2 bs=1M count=300 
> oflag=direct,sync
> dd: error writing ‘/mnt/storage/test.2’: Input/output error
> 127+0 records in
> 126+0 records out
> 132120576 bytes (132 MB) copied, 14.2165 s, 9.3 MB/s
> 
>  From syslog:
> 
> Apr 26 15:26:24 localhost lvm[897]: WARNING: Thin pool vg_kvm-thinpool-tpool 
> data is now 96.84% full.
> Apr 26 15:26:27 localhost kernel: device-mapper: thin: 253:4: reached low 
> water mark for data device: sending event.
> Apr 26 15:26:27 localhost kernel: device-mapper: thin: 253:4: switching pool 
> to out-of-data-space (error IO) mode
> Apr 26 15:26:34 localhost lvm[897]: WARNING: Thin pool vg_kvm-thinpool-tpool 
> data is now 100.00% full.
> 
> Despite write errors, the filesystem is not in read-only mode:


But you get a correct 'write' error - so from the application POV you get a
failing transaction update/write - so the app knows 'data' was lost and should
not proceed with the next transaction - so it's in line with 'no data is lost',
and the filesystem is not damaged and is in a correct state (mountable).


Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26 14:33                                 ` Zdenek Kabelac
@ 2017-04-26 16:37                                   ` Gionatan Danti
  2017-04-26 18:32                                     ` Stuart Gathman
                                                       ` (2 more replies)
  0 siblings, 3 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-04-26 16:37 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

On 26-04-2017 16:33, Zdenek Kabelac wrote:
> But you get correct 'write' error - so from application POV - you get 
> failing
> transaction update/write - so app knows  'data' were lost and should
> not proceed with next transaction - so it's in line with  'no data is
> lost' and filesystem is not damaged and is in correct state
> (mountable).

True, but the case exists that, even on a full pool, an application with
multiple outstanding writes will have some of them completed/committed
while others get I/O errors, as writes to already allocated space are
permitted while writes to non-allocated space fail. If, for
example, I overwrite some already-allocated files, the writes will be
committed even if the pool is completely full.

In past discussions, I had the impression that the only filesystem you
feel safe with on a thin pool is ext4 + remount-ro, on the assumption that
*any* failed write will trigger the read-only mode. But from my test it
seems that only *failed metadata updates* trigger the read-only mode. If
this is really the case, remount-ro really is a mandatory option.
However, as metadata can reside on already-allocated blocks, even on a
full pool they have a chance to be committed, without triggering the
remount-ro.

At the same time, I thought that you consider the thin pool + XFS combo
somewhat "risky", as XFS does not have a remount-ro option. Actually,
XFS seems to *always* shut down the filesystem in case of a failed metadata
update.

Maybe I misunderstood some of your messages; in that case, sorry for that.

Anyway, I think (and maybe I am wrong...) that the better solution is to 
fail *all* writes to a full pool, even the ones directed to allocated 
space. This will effectively "freeze" the pool and avoid any 
long-standing inconsistencies.

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26 16:37                                   ` Gionatan Danti
@ 2017-04-26 18:32                                     ` Stuart Gathman
  2017-04-26 19:24                                     ` Stuart Gathman
  2017-05-02 11:00                                     ` Gionatan Danti
  2 siblings, 0 replies; 94+ messages in thread
From: Stuart Gathman @ 2017-04-26 18:32 UTC (permalink / raw)
  To: linux-lvm

On 04/26/2017 12:37 PM, Gionatan Danti wrote:
>
> Anyway, I think (and maybe I am wrong...) that the better solution is
> to fail *all* writes to a full pool, even the ones directed to
> allocated space. This will effectively "freeze" the pool and avoid any
> long-standing inconsistencies.
+1  This is what I have been advocating also

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26 16:37                                   ` Gionatan Danti
  2017-04-26 18:32                                     ` Stuart Gathman
@ 2017-04-26 19:24                                     ` Stuart Gathman
  2017-05-02 11:00                                     ` Gionatan Danti
  2 siblings, 0 replies; 94+ messages in thread
From: Stuart Gathman @ 2017-04-26 19:24 UTC (permalink / raw)
  To: linux-lvm

On 04/26/2017 12:37 PM, Gionatan Danti wrote:
>
> Anyway, I think (and maybe I am wrong...) that the better solution is
> to fail *all* writes to a full pool, even the ones directed to
> allocated space. This will effectively "freeze" the pool and avoid any
> long-standing inconsistencies. 
Or slightly better: fail *all* writes to a full pool after the *first*
write to an unallocated area.  That way, operation can continue a little
longer without risking inconsistency so long as all writes are to
allocated areas. 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-26 16:37                                   ` Gionatan Danti
  2017-04-26 18:32                                     ` Stuart Gathman
  2017-04-26 19:24                                     ` Stuart Gathman
@ 2017-05-02 11:00                                     ` Gionatan Danti
  2017-05-12 13:02                                       ` Gionatan Danti
  2 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-05-02 11:00 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development



On 26/04/2017 18:37, Gionatan Danti wrote:
> True, but the case exists that, even on a full pool, an application with
> multiple outstanding writes will have some of them completed/commited
> while other get I/O error, as writes to already allocated space are
> permitted while writes to non-allocated space are failed. If, for
> example, I overwrite some already-allocated files, writes will be
> committed even if the pool is completely full.
>
> In past discussion, I had the impression that the only filesystem you
> feel safe with thinpool is ext4 + remount-ro, on the assumption that
> *any* failed writes will trigger the read-only mode. But from my test it
> seems that only *failed metadata updates* trigger the read-only mode. If
> this is really the case, remount-ro really is a mandatory option.
> However, as metadata can reside on alredy-allocated blocks, even of a
> full pool they have a chance to be committed, without triggering the
> remount-ro.
>
> At the same time, I thought that you consider the thinpool + xfs combo
> somewhat "risky", as xfs does not have a remount-ro option. Actually,
> xfs seems to *always* shutdown the filesystem in case of failed metadata
> update.
>
> Maybe I misunderstood some yours message; in this case, sorry for that.
>
> Anyway, I think (and maybe I am wrong...) that the better solution is to
> fail *all* writes to a full pool, even the ones directed to allocated
> space. This will effectively "freeze" the pool and avoid any
> long-standing inconsistencies.
>
> Thanks.
>

Hi Zdenek, I would *really* like to hear back from you on these questions.
Can we consider thinlvm + xfs as safe as thinlvm + ext4?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-02 11:00                                     ` Gionatan Danti
@ 2017-05-12 13:02                                       ` Gionatan Danti
  2017-05-12 13:42                                         ` Joe Thornber
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-05-12 13:02 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

On 02/05/2017 13:00, Gionatan Danti wrote:
>
>
> On 26/04/2017 18:37, Gionatan Danti wrote:
>> True, but the case exists that, even on a full pool, an application with
>> multiple outstanding writes will have some of them completed/commited
>> while other get I/O error, as writes to already allocated space are
>> permitted while writes to non-allocated space are failed. If, for
>> example, I overwrite some already-allocated files, writes will be
>> committed even if the pool is completely full.
>>
>> In past discussion, I had the impression that the only filesystem you
>> feel safe with thinpool is ext4 + remount-ro, on the assumption that
>> *any* failed writes will trigger the read-only mode. But from my test it
>> seems that only *failed metadata updates* trigger the read-only mode. If
>> this is really the case, remount-ro really is a mandatory option.
>> However, as metadata can reside on alredy-allocated blocks, even of a
>> full pool they have a chance to be committed, without triggering the
>> remount-ro.
>>
>> At the same time, I thought that you consider the thinpool + xfs combo
>> somewhat "risky", as xfs does not have a remount-ro option. Actually,
>> xfs seems to *always* shutdown the filesystem in case of failed metadata
>> update.
>>
>> Maybe I misunderstood some yours message; in this case, sorry for that.
>>
>> Anyway, I think (and maybe I am wrong...) that the better solution is to
>> fail *all* writes to a full pool, even the ones directed to allocated
>> space. This will effectively "freeze" the pool and avoid any
>> long-standing inconsistencies.
>>
>> Thanks.
>>
>
> Hi Zdeneck, I would *really* to hear back you on these questions.
> Can we consider thinlvm + xfs as safe as thinlvm + ext4 ?
>
> Thanks.
>

Hi all and sorry for the bump...
Anyone with some comments on these questions?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-12 13:02                                       ` Gionatan Danti
@ 2017-05-12 13:42                                         ` Joe Thornber
  2017-05-14 20:39                                           ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Joe Thornber @ 2017-05-12 13:42 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Zdenek Kabelac

On Fri, May 12, 2017 at 03:02:58PM +0200, Gionatan Danti wrote:
> On 02/05/2017 13:00, Gionatan Danti wrote:
> >>Anyway, I think (and maybe I am wrong...) that the better solution is to
> >>fail *all* writes to a full pool, even the ones directed to allocated
> >>space. This will effectively "freeze" the pool and avoid any
> >>long-standing inconsistencies.

I think dm-thin behaviour is fine given the semantics of write
and flush IOs.

A block device can complete a write even if it hasn't hit the physical
media; a flush request needs to come in at a later time, which means
'flush all IOs that you've previously completed'.  So any software using
a block device (fs, database etc.) tends to generate batches of writes,
followed by a flush to commit the changes.  For example, if there was a
power failure between the batch of write IO completing and the flush
completing, you do not know how much of the writes will be visible when
the machine comes back.

When a pool is full it will allow writes to provisioned areas of a thin device
to succeed.  But if any writes failed due to an inability to provision, then a
REQ_FLUSH IO to that thin device will *not* succeed.
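
A hedged way to see this distinction from the command line - the path is only
an example: plain buffered writes may be reported as successful even though
nothing has been flushed yet, while conv=fsync makes dd issue the flush itself
and report an error if that flush cannot complete:

dd if=/dev/zero of=/mnt/storage/batch.dat bs=1M count=64              # buffered writes
dd if=/dev/zero of=/mnt/storage/batch.dat bs=1M count=64 conv=fsync   # same writes plus a flush at the end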

- Joe

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-12 13:42                                         ` Joe Thornber
@ 2017-05-14 20:39                                           ` Gionatan Danti
  2017-05-15 12:50                                             ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-05-14 20:39 UTC (permalink / raw)
  To: LVM general discussion and development, Zdenek Kabelac; +Cc: Joe Thornber

On 12-05-2017 15:42, Joe Thornber wrote:
> On Fri, May 12, 2017 at 03:02:58PM +0200, Gionatan Danti wrote:
>> On 02/05/2017 13:00, Gionatan Danti wrote:
>> >>Anyway, I think (and maybe I am wrong...) that the better solution is to
>> >>fail *all* writes to a full pool, even the ones directed to allocated
>> >>space. This will effectively "freeze" the pool and avoid any
>> >>long-standing inconsistencies.
> 
> I think dm-thin behaviour is fine given the semantics of write
> and flush IOs.
> 
> A block device can complete a write even if it hasn't hit the physical
> media, a flush request needs to come in at a later time which means
> 'flush all IOs that you've previously completed'.  So any software 
> using
> a block device (fs, database etc), tends to generate batches of writes,
> followed by a flush to commit the changes.  For example if there was a
> power failure between the batch of write io completing and the flush
> completing you do not know how much of the writes will be visible when
> the machine comes back.
> 
> When a pool is full it will allow writes to provisioned areas of a thin 
> to
> succeed.  But if any writes failed due to inability to provision then a
> REQ_FLUSH io to that thin device will *not* succeed.
> 
> - Joe

True, but the real problem is that most of the failed flushes will *not*
bring the filesystem read-only, as both ext4 and xfs seem to go
read-only only when *metadata* updates fail. As this very same list
recommends using ext4 with errors=remount-ro on the basis that putting
the filesystem in a read-only state after any error is the right thing, I
was somewhat alarmed to find that, as far as I can tell, ext4 goes
read-only on metadata errors only.

So, let me reiterate: can we consider thinlvm + xfs as safe as thinlvm + 
ext4 + errors=remount-ro?
Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-14 20:39                                           ` Gionatan Danti
@ 2017-05-15 12:50                                             ` Zdenek Kabelac
  2017-05-15 14:48                                               ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-05-15 12:50 UTC (permalink / raw)
  To: linux-lvm, Gionatan Danti

On 14.5.2017 at 22:39, Gionatan Danti wrote:
> Il 12-05-2017 15:42 Joe Thornber ha scritto:
>> On Fri, May 12, 2017 at 03:02:58PM +0200, Gionatan Danti wrote:
>>> On 02/05/2017 13:00, Gionatan Danti wrote:
>>> >>Anyway, I think (and maybe I am wrong...) that the better solution is to
>>> >>fail *all* writes to a full pool, even the ones directed to allocated
>>> >>space. This will effectively "freeze" the pool and avoid any
>>> >>long-standing inconsistencies.
>>
>> I think dm-thin behaviour is fine given the semantics of write
>> and flush IOs.
>>
>> A block device can complete a write even if it hasn't hit the physical
>> media, a flush request needs to come in at a later time which means
>> 'flush all IOs that you've previously completed'.  So any software using
>> a block device (fs, database etc), tends to generate batches of writes,
>> followed by a flush to commit the changes.  For example if there was a
>> power failure between the batch of write io completing and the flush
>> completing you do not know how much of the writes will be visible when
>> the machine comes back.
>>
>> When a pool is full it will allow writes to provisioned areas of a thin to
>> succeed.  But if any writes failed due to inability to provision then a
>> REQ_FLUSH io to that thin device will *not* succeed.
>>
>> - Joe
> 
> True, but the real problem is that most of the failed flushes will *not* bring 
> the filesystem read-only, as both ext4 and xfs seems to go read-only only when 
> *metadata* updates fail. As this very same list recommend using ext4 with 
> errors=remount-ro on the basis that putting the filesystem in a read-only 
> state after any error I the right thing, I was somewhat alarmed to find that, 
> as far I can tell, ext4 goes read-only on metadata errors only.
> 
> So, let me reiterate: can we consider thinlvm + xfs as safe as thinlvm + ext4 
> + errors=remount-ro?


Hi

I still think you are mixing apples & oranges together and you are expecting
the answer '42' :)

There is simply NO simple answer. Every case has its pros & cons.

There are simply cases where XFS beats Ext4 and there are opposite situations
as well.

Also, you WILL always get a WRITE error - if your application doesn't care
about write errors - why do you expect any block-device logic could rescue
you??

An out-of-space thin-pool is simply a device which looks like a seriously
damaged disk, where you can always read something without any problem but you
fail to write things here and there.

IMHO both filesystems, XFS & Ext4, on recent kernels do work well - but no one
can say there are no problems at all.

Things are getting better - but planning usage of a thin-pool to 'recover' an
overfilled pool is simply BAD planning. You should plan your thin-pool usage
to NOT run out-of-space.

And a last comment I always make - a full thin-pool is not similar to a full
filesystem where you drop some 'large' file and you are happily working again
- it does not work this way - and if someone hoped for this - he needs to use
something else ATM.


Regards


Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-15 12:50                                             ` Zdenek Kabelac
@ 2017-05-15 14:48                                               ` Gionatan Danti
  2017-05-15 15:33                                                 ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-05-15 14:48 UTC (permalink / raw)
  To: Zdenek Kabelac, linux-lvm

On 15/05/2017 14:50, Zdenek Kabelac wrote:
> Hi
> 
> I still think you are mixing apples & oranges together and you expecting 
> answer '42' :)

'42' would be the optimal answer :p

> There is simply NO simple answer. Every case has its pros & cons.
> 
> There is simply cases where XFS beats Ext4 and there are opposite 
> situations as well.

Maybe I'm too naive, but I have a hard time grasping all the
implications of this sentence.

I fully understand that, currently, a full thinp is basically a "damaged 
disk", where some writes can complete (good/provisioned zones) and some 
fail (bad/unprovisioned zones). I also read the device-mapper docs and I 
understand that, currently, a "fail all writes but let reads succeed" 
target does not exist.

What I do not understand is how XFS and EXT4 differ when a thinp is
full. From a previous reply of yours, after I asked how to put thinp in
read-only mode when full:

"Using 'ext4' with remount-ro is fairly easy to setup and get exactly 
this logic."

My naive interpretation is that when EXT4 detects *any* I/O error, it
will set the filesystem in read-only mode. Except that my tests show
that only failed *metadata* updates put the filesystem in this state. The
bad thing is that, when not using "remount-ro", even failed metadata
updates will *not* trigger any read-only response.

In short, am I right in saying that EXT4 should *always* be used with
"remount-ro" when stacked on top of a thinp?

On the other hand, XFS has no such option but it, by default, ensures
that failed *metadata* updates will stop the filesystem. Even reads are
not allowed (to regain read access, you need to repair the filesystem or
mount it with "ro,norecovery").

So, it should be even safer than EXT4, right? Or do you feel that is the 
other way around? If so, why?

> Things are getting better -  but  planning  usage of thin-pool to 
> 'recover' overfilled pool is simple BAD planning. You should plan your 
> thin-pool usage to NOT run out-of-space.

Sure, and I am *not* planning for it. But as bad things always happen, 
I'm preparing for them ;)

> And last comment I always say -  full thin-pool it not  similar to full 
> filesystem where you drop some 'large' file and you are happily working 
> again - it's not working this way - and if someone hoped into this - he 
> needs to use something else ATM.

Absolutely.

Sorry if I seem pedantic, I am genuinely trying to understand.
Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-15 14:48                                               ` Gionatan Danti
@ 2017-05-15 15:33                                                 ` Zdenek Kabelac
  2017-05-16  7:53                                                   ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-05-15 15:33 UTC (permalink / raw)
  To: Gionatan Danti, linux-lvm

Dne 15.5.2017 v 16:48 Gionatan Danti napsal(a):
> On 15/05/2017 14:50, Zdenek Kabelac wrote> Hi
>>
> What I does not understand is how XFS and EXT4 differs when a thinp is full. 
>  From a previous your reply, after I asked how to put thinp in read only mode 
> when full:
> 
> "Using 'ext4' with remount-ro is fairly easy to setup and get exactly this 
> logic."
> 
> My naive interpretation is that when EXT4 detects *any* I/O error, it will set 
> the filesystem in read-only mode. Except that my tests show that only failed 
> *metadata* update put the filesystem in this state. The bad thingh is that, 
> when not using "remount-ro", even failed metadata updates will *not* trigger 
> any read-only response.


Have you ever tested this:

mount -o errors=remount-ro,data=journal ?

Everything has its price - if you also want 'safe' data - well, you have
to pay the price.
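
Spelled out with example device and mount-point names (and persistently via
fstab) - just a sketch:

mount -o errors=remount-ro,data=journal /dev/vg_kvm/thinvol /mnt/storage
# or in /etc/fstab:
/dev/vg_kvm/thinvol  /mnt/storage  ext4  defaults,errors=remount-ro,data=journal  0 2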


> On the other hand, XFS has not such options but it, by default, ensures that 
> failed *metadata* updates will stop the filesystem. Even reads are not allowed 
> (to regain read access, you need to repair the filesystem or mount it with 
> "ro,norecovery").
> 
> So, it should be even safer than EXT4, right? Or do you feel that is the other 
> way around? If so, why?

I prefer 'remount-ro'  as the FS is still at least accessible/usable in some way.



> 
>> Things are getting better -  but  planning  usage of thin-pool to 'recover' 
>> overfilled pool is simple BAD planning. You should plan your thin-pool usage 
>> to NOT run out-of-space.
> 
> Sure, and I am *not* planning for it. But as bad things always happen, I'm 
> preparing for them ;)

When you have extra space you can add for recovery - it's usually easy.
But you will have a much harder time doing recovery without extra space.

So again - everything has its price....

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-15 15:33                                                 ` Zdenek Kabelac
@ 2017-05-16  7:53                                                   ` Gionatan Danti
  2017-05-16 10:54                                                     ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2017-05-16  7:53 UTC (permalink / raw)
  To: Zdenek Kabelac, linux-lvm

On 15/05/2017 17:33, Zdenek Kabelac wrote:
> Ever tested this:
> 
> mount -o errors=remount-ro,data=journal ?

Yes, I tested it - same behavior: a full thinpool does *not* immediately 
put the filesystem in a read-only state, even when using sync/fsync and 
"errorwhenfull=y".

So, it seems EXT4 remounts in read-only mode only when *metadata* 
updates fail.

> I prefer 'remount-ro'  as the FS is still at least accessible/usable in 
> some way.

Fair enough.

>>> Things are getting better

Can you give an example?

> So again - all has its price....

True ;)

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-16  7:53                                                   ` Gionatan Danti
@ 2017-05-16 10:54                                                     ` Zdenek Kabelac
  2017-05-16 13:38                                                       ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2017-05-16 10:54 UTC (permalink / raw)
  To: LVM general discussion and development, Gionatan Danti

On 16.5.2017 at 09:53, Gionatan Danti wrote:
> On 15/05/2017 17:33, Zdenek Kabelac wrote:> Ever tested this:
>>
>> mount -o errors=remount-ro,data=journal ?
> 
> Yes, I tested it - same behavior: a full thinpool does *not* immediately put 
> the filesystem in a read-only state, even when using sync/fsync and 
> "errorwhenfull=y".

Hi

Somehow I think you've made a mistake during your test (or you have a
buggy kernel). Can you take a full log of your test showing that all options
were properly applied -

i.e. a dmesg log + a /proc/self/mountinfo report showing all options used for
the mountpoint, and the kernel version in use.
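
Something along these lines (the mount point is only an example):

uname -r
grep /mnt/storage /proc/self/mountinfo
dmesg | tail -50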

IMHO you should get something like this in dmesg once your pool gets out of 
space and starts to return error on write:

----
Aborting journal on device dm-4-8.
EXT4-fs error (device dm-4): ext4_journal_check_start:60: Detected aborted journal
EXT4-fs (dm-4): Remounting filesystem read-only
----


Clearly, when you specify 'data=journal', even a write failure on data will
cause a journal error and thus the remount-ro reaction (at least on my box it
does) - but such usage is noticeably slower compared with 'ordered' mode.


Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-05-16 10:54                                                     ` Zdenek Kabelac
@ 2017-05-16 13:38                                                       ` Gionatan Danti
  0 siblings, 0 replies; 94+ messages in thread
From: Gionatan Danti @ 2017-05-16 13:38 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development

On 16/05/2017 12:54, Zdenek Kabelac wrote:
> 
> Hi
> 
> Somehow I think you've rather made a mistake during your test (or you 
> have buggy kernel). Can you take full log of your test  show all options 
> are
> properly applied
> 
> i.e. dmesg log +  /proc/self/mountinfo report showing all options used 
> for mountpoint and kernel version in use.
> 
> IMHO you should get something like this in dmesg once your pool gets out 
> of space and starts to return error on write:
> 
> ----
> Aborting journal on device dm-4-8.
> EXT4-fs error (device dm-4): ext4_journal_check_start:60: Detected 
> aborted journal
> EXT4-fs (dm-4): Remounting filesystem read-only
> ----
> 
> 
> Clearly when you specify 'data=journal'  even write failure of data will 
> cause journal error and thus remount-ro reaction (it least on my box 
> does it) - but such usage is noticeable slower compared with 'ordered' 
> mode.

Zdenek, you are right: re-executing the test, I now see the following 
dmesg entries:

[ 1873.677882] Aborting journal on device dm-6-8.
[ 1873.757170] EXT4-fs error (device dm-6): ext4_journal_check_start:56: 
Detected aborted journal
[ 1873.757184] EXT4-fs (dm-6): Remounting filesystem read-only

At the same time, looking at bash history and /var/log/messages it 
*seems* that I did nothing wrong with previous tests. I'll do more tests 
and post here if I find something relevant.

Thanks for your time and patience.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2017-04-24 21:59                     ` Zdenek Kabelac
  2017-04-26  7:26                       ` Gionatan Danti
@ 2018-02-27 18:39                       ` Xen
  2018-02-28  9:26                         ` Zdenek Kabelac
  1 sibling, 1 reply; 94+ messages in thread
From: Xen @ 2018-02-27 18:39 UTC (permalink / raw)
  To: linux-lvm

Zdenek Kabelac wrote on 24-04-2017 23:59:

>>> I'm just curious - what do you think will happen when you have
>>> root_LV as a thin LV and the thin pool runs out of space - so 'root_LV'
>>> is replaced with the 'error' target.
>> 
>> Why do you suppose Root LV is on thin?
>> 
>> Why not just stick to the common scenario when thin is used for extra 
>> volumes or data?
>> 
>> I mean to say that you are raising an exceptional situation as an 
>> argument against something that I would consider quite common, which 
>> doesn't quite work that way: you can't prove that most people would 
>> not want something by raising something most people wouldn't use.
>> 
>> I mean to say let's just look at the most common denominator here.
>> 
>> Root LV on thin is not that.
> 
> Well then you might be surprised - there are user using exactly this.

I am sorry, this was a long time ago.

I was concerned with thin full behaviour and I guess I was concerned 
with being able to limit thin snapshot sizes.

I said that application failure was acceptable, but system failure not.

Then you brought up root on thin as a way of "upping the ante".

I contended that this is a bigger problem to tackle, but it shouldn't 
mean you shouldn't tackle the smaller problems.

(The smaller problem being data volumes).

Even if root is on thin and you are using it for snapshotting, it would 
be extremely unwise to overprovision such a thing or to depend on 
"additional space" being added by the admin; root filesystems are not 
meant to be expandable.

If on the other hand you do count on overprovisioning (due to snapshots) 
then being able to limit snapshot size becomes even more important.

> When you have rootLV on thinLV - you could easily snapshot it before
> doing any upgrade and revert back in case something fails on upgrade.
> See also projects like snapper...

True enough, but if you risk filling your pool because you don't have 
full room for a full snapshot, that would be extremely unwise. I'm also 
not sure write performance for a single snapshot is very much different 
between thin and non-thin?

They are both CoW. E.g. when you write to an existing block it has to be
duplicated; only for non-allocated writes is thin faster, right?

I simply cannot reconcile an attitude that thin-full-risk is acceptable 
and the admin's job while at the same time advocating it for root 
filesystems.

Now, for most of this thread I was under the impression that "SYSTEM HANGS"
were the norm, because that's the only thing I ever experienced (kernel
3.x and kernel 4.4 back then); however you said that this was fixed in
later kernels.

So given that, some of the disagreement here was void as apparently no 
one advocated that these hangs were acceptable ;-).

:).


>> I have tried it, yes. Gives troubles with Grub and requires thin 
>> package to be installed on all systems and makes it harder to install 
>> a system too.
> 
> lvm2 is cooking some better boot support atm....

Grub-probe couldn't find the root volume so I had to maintain my own 
grub.cfg.

Regardless, if I ever used this again I would take care to never
overprovision, or to only overprovision at low risk with respect to
snapshots.

Ie. you could thin provision root + var or something similar but I would 
always put data volumes (home etc) elsewhere.

Ie. not share the same pool.

Up to now I was using a regular snapshot, but I allocated it too small
and it always got dropped much faster than I anticipated.

(A 1GB snapshot constantly filling up with even minor upgrade 
operations).



>> Thin root LV is not the idea for most people.
>> 
>> So again, don't you think having data volumes produce errors is not 
>> preferable to having the entire system hang?
> 
> Not sure why you insist system hangs.
> 
> If system hangs - and you have recent kernel & lvm2 - you should fill 
> bug.
> 
> If you set  '--errorwhenfull y'  - it should instantly fail.
> 
> There should not be any hanging..

Right well Debian Jessie and Ubuntu Xenial just experienced that.



>> That's irrelevant; if the thin pool is full you need to mitigate it, 
>> rebooting won't help with that.
> 
> well it's really admins task to solve the problem after panic call.
> (adding new space).

That's a lot easier if your root filesystem doesn't lock up.

;-).

Good luck booting to some rescue environment on a VPS or with some boot 
stick on a PC; the Ubuntu rescue environment for instance has been 
abysmal since SystemD.

You can't actually use the rescue environment because there is some 
weird interaction with systemd spewing messages and causing weird 
behaviour on the TTY you are supposed to work on.

Initrd works, yes, but the "full rescue" systemd target doesn't.

My point with this thread was.....




When my root snapshot fills up and gets dropped, I lose my undo history, 
but at least my root filesystem won't lock up.

I just calculated the size too small, and I assume I can also put a
snapshot IN a thin pool for a non-thin root volume?

Haven't tried.

However, I don't have the space for a full copy of every filesystem, so 
if I snapshot, I will automatically overprovision.

My snapshots are indeed meant for backups (of data volumes), not for
rollback; I do use rollback as well, but only for the root filesystem.

So: my thin snapshots are meant for backup,
     my root snapshot (non-thin) is meant for rollback.

But, if any application really misbehaved... previously the entire 
system would crash (kernel 3.x).

So, the only defense is constant monitoring and emails, or even tty/pty
broadcasts, because, well, sometimes it is just human error where you copy
the wrong thing to the wrong place.

Because I cannot limit my (backup) snapshots in size.

With sufficient monitoring I guess that is not much of an issue.
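
A minimal sketch of such monitoring (pool name and threshold are only examples;
data_percent comes straight from lvs):

PCT=$(lvs --noheadings -o data_percent vg_kvm/thinpool | tr -d ' ')
[ "${PCT%%.*}" -ge 90 ] && echo "thin pool vg_kvm/thinpool at ${PCT}%" | wall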

> Thin users can't expect to overload system in crazy way and expect the
> system will easily do something magical to restore all data.

That was never asked.

My problem was system hangs, but my question was about limiting snapshot 
size on thin.

However userspace response scripts were obviously possible.....

Including those that would prioritize dropping thin snapshots over other 
measures.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-02-27 18:39                       ` Xen
@ 2018-02-28  9:26                         ` Zdenek Kabelac
  2018-02-28 19:07                           ` Gionatan Danti
  2018-03-03 17:52                           ` Xen
  0 siblings, 2 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2018-02-28  9:26 UTC (permalink / raw)
  To: LVM general discussion and development, Xen

On 27.2.2018 at 19:39, Xen wrote:
> Zdenek Kabelac schreef op 24-04-2017 23:59:
> 
>>>> I'm just curious - what do you think will happen when you have
>>>> root_LV as thin LV and thin pool runs out of space - so 'root_LV'
>>>> is replaced with 'error' target.
>>>
>>> Why do you suppose Root LV is on thin?
>>>
>>> Why not just stick to the common scenario when thin is used for extra 
>>> volumes or data?
>>>
>>> I mean to say that you are raising an exceptional situation as an argument 
>>> against something that I would consider quite common, which doesn't quite 
>>> work that way: you can't prove that most people would not want something by 
>>> raising something most people wouldn't use.
>>>
>>> I mean to say let's just look at the most common denominator here.
>>>
>>> Root LV on thin is not that.
>>
>> Well then you might be surprised - there are user using exactly this.
> 
> I am sorry, this is a long time ago.
> 
> I was concerned with thin full behaviour and I guess I was concerned with 
> being able to limit thin snapshot sizes.
> 
> I said that application failure was acceptable, but system failure not.

Hi

I'll probably repeat myself again, but thin provisioning can't be responsible
for all kernel failures. There is no way the DM team can fix all the related
paths on this road.

If you don't plan to help resolve those issues - there is no point in
complaining over and over again - we are already well aware of these issues...

The admin needs to be aware of the 'pros & cons' and has to use thin technology
in the right place for the right task.

If the admin can't stand a failing system, he can't use thin-p.

Overprovisioning on the DEVICE level simply IS NOT equivalent to a full
filesystem, as you would like to see all the time here, and it has already been
explained to you many times that filesystems are simply not there yet - fixes
are ongoing but they will take their time and it's really pointless to exercise
this on 2-3 year old kernels...

Thin provisioning has its use case and it expects the admin to be well aware of
the possible problems.

If you are aiming for a magic box that always works right - stay away from
thin-p - that's the best advice....

> Even if root is on thin and you are using it for snapshotting, it would be 
> extremely unwise to overprovision such a thing or to depend on "additional 
> space" being added by the admin; root filesystems are not meant to be expandable.

Do NOT take thin snapshots of your root filesystem and you will avoid the
thin-pool overprovisioning problem.

> True enough, but if you risk filling your pool because you don't have full 
> room for a full snapshot, that would be extremely unwise. I'm also not sure 
> write performance for a single snapshot is very much different between thin 
> and non-thin?

Rule #1:

Thin-pool was never targeted at 'regular' usage of a full thin-pool.
A full thin-pool is a serious ERROR condition with bad/ill effects on systems.
Thin-pool was designed to 'delay/postpone' real space usage - aka you can use
more 'virtual' space with the promise that you deliver real storage later.

So if you have different goals - like having some kind of logic fully
equivalent to a full filesystem - you need to write a different target....


> I simply cannot reconcile an attitude that thin-full-risk is acceptable and 
> the admin's job while at the same time advocating it for root filesystems.

Do NOT use thin-provisioning - as it's not meeting your requirements.

> Now most of this thread I was under the impression that "SYSTEM HANGS" where 
> the norm because that's the only thing I ever experienced (kernel 3.x and 
> kernel 4.4 back then), however you said that this was fixed in later kernels.

Big news - we are at ~4.16 kernel upstream - so no one is really taking much
care about 4.4 troubles here - sorry about that....

Speaking of 4.4 - I'd generally advise jumping to higher kernel versions
ASAP - since 4.4 has some known bad behavior in the case where the thin-pool
'metadata' gets overfilled.


>> lvm2 is cooking some better boot support atm....
> 
> Grub-probe couldn't find the root volume so I had to maintain my own grub.cfg.

There is an ongoing 'BOOM' project - please check it out....


>> There should not be any hanging..
> 
> Right well Debian Jessie and Ubuntu Xenial just experienced that.

There is not much point in commenting on support for some old distros, other
than that you really should try harder with your distro maintainers....

>>> That's irrelevant; if the thin pool is full you need to mitigate it, 
>>> rebooting won't help with that.
>>
>> well it's really admins task to solve the problem after panic call.
>> (adding new space).
> 
> That's a lot easier if your root filesystem doesn't lock up.

- this is not really a fault of the dm thin-provisioning kernel part.
- ongoing fixes to filesystems are being pushed upstream (and have been for years).
- fixes will not appear in years-old kernels, as such patches are usually
invasive, so unless you pay someone to do the backporting job the easiest
way forward is to use a newer, improved kernel..

> When my root snapshot fills up and gets dropped, I lose my undo history, but 
> at least my root filesystem won't lock up.

lvm2 fully supports these snapshots as well as thin-snapshots.
The admin has to choose 'the best fit'.

ATM  thin-pool can't deliver equivalent logic - just like old-snaps can't 
deliver thin-pool logic.

> However, I don't have the space for a full copy of every filesystem, so if I 
> snapshot, I will automatically overprovision.

Back to rule #1 - thin-p is about 'delaying' the delivery of real space.
If you already plan to never deliver the promised space - you need to live
with the consequences....


> My snapshots are indeed meant for backups (of data volumes) ---- not for 
> rollback ----- and for rollback ----- but only for the root filesystem.

There is a more fundamental problem here:

!SNAPSHOTS ARE NOT BACKUPS!

This is the key problem with your thinking here (unfortunately you are not
'alone' in this thinking)


> With sufficient monitoring I guess that is not much of an issue.

We do provide quite good 'scripting' support for this case - but again, if
the system must not crash - you can't use thin-pool for your root LV, or you
can't use over-provisioning.


> My problem was system hangs, but my question was about limiting snapshot size 
> on thin.

Well, your problem primarily is the usage of a too old system....

Sorry to say this - but if you insist on sticking with an old system - ask your
distro maintainers to do all the backporting work for you - this is nothing
lvm2 can help with...


Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-02-28  9:26                         ` Zdenek Kabelac
@ 2018-02-28 19:07                           ` Gionatan Danti
  2018-02-28 21:43                             ` Zdenek Kabelac
  2018-03-03 18:17                             ` Xen
  2018-03-03 17:52                           ` Xen
  1 sibling, 2 replies; 94+ messages in thread
From: Gionatan Danti @ 2018-02-28 19:07 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen

Hi all,

On 28-02-2018 10:26, Zdenek Kabelac wrote:
> Overprovisioning on DEVICE level simply IS NOT equivalent to full
> filesystem like you would like to see all the time here and you've
> been already many times explained that filesystems are simply not
> there ready - fixes are on going but it will take its time and it's
> really pointless to exercise this on 2-3 year old kernels...

this was really beaten to death in the past months/threads. I generally
agree with Zdenek.

To recap (Zdenek, correct me if I am wrong): the main problem is that,
on a full pool, async writes will more-or-less silently fail (with errors
shown in dmesg, but nothing more). Another possible cause of problems is
that, even on a full pool, *some* writes will complete correctly (the
ones on already allocated chunks).

In the past it was argued that putting the entire pool in read-only mode
(where *all* writes fail, but reads are permitted to complete) would be a
better fail-safe mechanism; however, it was stated that no current
dmtarget permits that.

Two (good) solutions were given, both relying on scripting (see the
"thin_command" option in lvm.conf; a sketch follows below):
- fsfreeze on a nearly full pool (ie: >=98%);
- replace the dmthinp target with the error target (using dmsetup).

I really think that with the good scripting infrastructure currently
built into lvm this is a more-or-less solved problem.
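
For what it's worth, a minimal sketch of such a hook - all names are examples,
and the exact calling convention of thin_command should be checked against
lvmthin(7) for the lvm2 version in use; here the script simply re-reads the
usage itself via lvs:

# lvm.conf:  thin_command = "/usr/local/sbin/thin-guard.sh"

#!/bin/sh
# hypothetical guard script pointed to by thin_command (run by dmeventd as
# pool usage grows)
PCT=$(lvs --noheadings -o data_percent vg_kvm/thinpool | tr -d ' ')
if [ "${PCT%%.*}" -ge 98 ]; then
    # option 1: freeze the filesystem sitting on the pool's thin LV
    fsfreeze -f /mnt/storage
    # option 2 (harsher): swap the thin LV's table for the error target
    # SECTORS=$(blockdev --getsz /dev/vg_kvm/thinvol)
    # dmsetup suspend vg_kvm-thinvol
    # dmsetup load vg_kvm-thinvol --table "0 $SECTORS error"
    # dmsetup resume vg_kvm-thinvol
fi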

> Do NOT take thin snapshot of your root filesystem so you will avoid
> thin-pool overprovisioning problem.

But is someone *really* pushing thinp for the root filesystem? I have always
used it for data partitions only... Sure, rollback capability on root is nice,
but it is on data that snapshots are *really* important.

> Thin-pool was never targeted for 'regular' usage of full thin-pool.
> Full thin-pool is serious ERROR condition with bad/ill effects on 
> systems.
> Thin-pool was designed to 'delay/postpone' real space usage - aka you
> can use more 'virtual' space with the promise you deliver real storage
> later.

In stress testing, I never saw a system crash on a full thin pool, but I
was not using it on the root filesystem. Are there any ill effects on system
stability which I need to know about?

>> When my root snapshot fills up and gets dropped, I lose my undo 
>> history, but at least my root filesystem won't lock up.

We discussed that in the past also, but as snapshot volumes really are
*regular*, writable volumes (with a 'k' flag to skip activation by
default), the LVM team takes the "safe" stance of not automatically dropping
any volume.

The solution is to use scripting/thin_command with lvm tags, for
example (a minimal sketch follows below):
- tag all snapshots with a "snap" tag;
- when usage is dangerously high, drop all volumes with the "snap" tag.
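
Sketched with hypothetical names (@tag selection is standard lvm syntax):

lvcreate -s vg_kvm/thinvol -n thinvol_snap1 --addtag snap   # tag the snapshot at creation time
lvremove -y @snap             # from the monitoring hook: drop everything tagged 'snap'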

>> However, I don't have the space for a full copy of every filesystem, 
>> so if I snapshot, I will automatically overprovision.
> 
> Back to rule #1 - thin-p is about 'delaying' deliverance of real space.
> If you already have plan to never deliver promised space - you need to
> live with consequences....

I am not sure I 100% agree on that. Thinp is not only about "delaying"
space provisioning; it clearly is also (mostly?) about fast, modern,
usable snapshots. Docker, snapper, stratis, etc. all use thinp mainly
for its fast, efficient snapshot capability. Denying that is not so
useful and leads to "overwarning" (ie: when snapshotting a volume on a
virtually-fillable thin pool).

> 
> !SNAPSHOTS ARE NOT BACKUPS!
> 
> This is the key problem with your thinking here (unfortunately you are
> not 'alone' with this thinking)

Snapshots are not backups, as they do not protect from hardware problems
(and denying that would be lame); however, they are an invaluable *part*
of a successful backup strategy. Having multiple rollback targets, even
on the same machine, is a very useful tool.

> We do provide quite good 'scripting' support for this case - but again 
> if
> the system can't crash - you can't use thin-pool for your root LV or
> you can't use over-provisioning.

Again, I don't understand why we are speaking about system crashes. On
root *not* using thinp, I never saw a system crash due to a full data
pool.

Oh, and I use thinp on RHEL/CentOS only (Debian/Ubuntu backports are way 
too limited).

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-02-28 19:07                           ` Gionatan Danti
@ 2018-02-28 21:43                             ` Zdenek Kabelac
  2018-03-01  7:14                               ` Gionatan Danti
  2018-03-03 18:32                               ` Xen
  2018-03-03 18:17                             ` Xen
  1 sibling, 2 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2018-02-28 21:43 UTC (permalink / raw)
  To: LVM general discussion and development, Gionatan Danti; +Cc: Xen

On 28.2.2018 at 20:07, Gionatan Danti wrote:
> Hi all,
> 
> Il 28-02-2018 10:26 Zdenek Kabelac ha scritto:
>> Overprovisioning on DEVICE level simply IS NOT equivalent to full
>> filesystem like you would like to see all the time here and you've
>> been already many times explained that filesystems are simply not
>> there ready - fixes are on going but it will take its time and it's
>> really pointless to exercise this on 2-3 year old kernels...
> 
> this was really beaten to death in the past months/threads. I generally agree 
> with Zedenk.
> 
> To recap (Zdeneck, correct me if I am wrong): the main problem is that, on a 
> full pool, async writes will more-or-less silenty fail (with errors shown on 
> dmesg, but nothing more). Another possible cause of problem is that, even on a 
> full pool, *some* writes will complete correctly (the one on already allocated 
> chunks).

By default - a full pool starts to 'error' all 'writes' after 60 seconds.
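
That 60-second window corresponds to dm-thin's no_space_timeout module
parameter - shown here only as a pointer; 0 disables the timeout so the pool
queues IO indefinitely:

cat /sys/module/dm_thin_pool/parameters/no_space_timeout
# persistently, e.g. in /etc/modprobe.d/dm-thin.conf:
# options dm_thin_pool no_space_timeout=120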

> 
> In the past was argued that putting the entire pool in read-only mode (where 
> *all* writes fail, but read are permitted to complete) would be a better 
> fail-safe mechanism; however, it was stated that no current dmtarget permit that.

Yep - I'd probably like to see a slightly different mechanism - where all
ongoing writes would fail - so far - some 'writes' will pass (those to already
provisioned areas) - some will fail (those to unprovisioned ones).

The main problem is - after reboot - this 'missing/unprovisioned' space may 
provide some old data...

> 
> Two (good) solution where given, both relying on scripting (see "thin_command" 
> option on lvm.conf):
> - fsfreeze on a nearly full pool (ie: >=98%);
> - replace the dmthinp target with the error target (using dmsetup).

Yep - this all can happen via 'monitoring'.
The key is to do it early, before disaster happens.

> I really think that with the good scripting infrastructure currently built in 
> lvm this is a more-or-less solved problem.

It still depends - there is always some sort of 'race' - unless you are 
willing to 'give-up' too early to be always sure, considering there are 
technologies that may write many GB/s...

>> Do NOT take thin snapshot of your root filesystem so you will avoid
>> thin-pool overprovisioning problem.
> 
> But is someone *really* pushing thinp for root filesystem? I always used it 

You can use rootfs with thinp - it's very fast for testing i.e. upgrades
and quickly revert back - just there should be enough free space.
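
A rough sketch of that workflow (vg/root is a placeholder thin LV; merging a
thin snapshot back needs a reasonably recent lvm2 and, depending on the
version, is spelled --merge or --mergethin):

  # before the upgrade
  lvcreate -s -n root_pre_upgrade vg/root

  # upgrade went fine - drop the snapshot
  lvremove vg/root_pre_upgrade

  # upgrade went wrong - roll the origin back to the snapshot
  lvconvert --merge vg/root_pre_upgrade
  # ...and reboot; the merge takes effect on the next activation of vg/root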

> In stress testing, I never saw a system crash on a full thin pool, but I was 
> not using it on the root filesystem. Are there any ill effects on system stability 
> which I need to know about?

Depends on version of kernel and filesystem in use.

Note the RHEL/CentOS kernel has lots of backports even when it looks quite old.


> The solution is to use scripting/thin_command with lvm tags. For example:
> - tag all snapshot with a "snap" tag;
> - when usage is dangerously high, drop all volumes with "snap" tag.

Yep - every user has different plans in his mind - scripting gives user 
freedom to adapt this logic to local needs...
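
A sketch of that approach (tag name, pool name and the 95% threshold are
arbitrary placeholders):

  # create the snapshot with a tag attached
  lvcreate -s -n data_snap --addtag snap vg/data

  # in the monitoring/thin_command script: once the pool is dangerously
  # full, drop every volume carrying the "snap" tag
  POOL="vg/pool"
  USED=$(lvs --noheadings -o data_percent "$POOL" | tr -d ' ' | cut -d. -f1)
  if [ "$USED" -ge 95 ]; then
      lvs --noheadings -o lv_full_name @snap | xargs -r -n1 lvremove -fy
  fi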

>>> However, I don't have the space for a full copy of every filesystem, so if 
>>> I snapshot, I will automatically overprovision.

As long as the responsible admin controls space in the thin-pool and takes action
long before the thin-pool runs out of space, all is fine.

If the admin hopes for some kind of magic to happen - we have a problem....


>>
>> Back to rule #1 - thin-p is about 'delaying' deliverance of real space.
>> If you already have plan to never deliver promised space - you need to
>> live with consequences....
> 
> I am not sure I 100% agree with that. Thinp is not only about "delaying" space 
> provisioning; it clearly is also (mostly?) about fast, modern, usable 
> snapshots. Docker, snapper, stratis, etc. all use thinp mainly for its fast, 
> efficient snapshot capability. Denying that is not so useful and leads to 
> "overwarning" (ie: when snapshotting a volume on a virtually-fillable thin pool).

Snapshots are using space - with the hope that if you 'really' need that space
you either add this space to your system - or you drop snapshots.

Still the same logic applied....

>> !SNAPSHOTS ARE NOT BACKUPS!
>>
>> This is the key problem with your thinking here (unfortunately you are
>> not 'alone' with this thinking)
> 
> Snapshots are not backups, as they do not protect from hardware problems (and 
> denying that would be lame); however, they are an invaluable *part* of a 
> successful backup strategy. Having multiple rollback targets, even on the 
> same machine, is a very useful tool.

Backups primarily sit on completely different storage.

If you keep backups of data in the same pool:

1.)
an error in a single chunk shared by all your backups + origin means 
total data loss - especially in the case where the filesystem uses 'B-trees' and 
some 'root node' is lost - it can easily render your origin + all backups 
completely useless.

2.)
problems in thin-pool metadata can make all your origin + backups just an 
unordered mess of chunks.


> Again, I don't understand why we are speaking about system crashes. With root 
> *not* using thinp, I never saw a system crash due to a full data pool.
>
> Oh, and I use thinp on RHEL/CentOS only (Debian/Ubuntu backports are way too 
> limited).

Yep - this case is known to be pretty stable.

But as said - with today's 'rush' of development and load of updates - users do 
want to try a 'new distro upgrade' - if it works - all is fine - if it doesn't, 
let's have a quick road back - so using a thin volume for rootfs is a pretty 
wanted case.

Trouble is there are quite a lot of issues that are non-trivial to solve.

There are also some ongoing ideas/projects - one of them was to have thinLVs 
with priority to be always fully provisioned - so such a thinLV could never be 
the one to have unprovisioned chunks....
The other was a better integration of filesystems with 'provisioned' volumes.


Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-02-28 21:43                             ` Zdenek Kabelac
@ 2018-03-01  7:14                               ` Gionatan Danti
  2018-03-01  8:31                                 ` Zdenek Kabelac
  2018-03-03 18:32                               ` Xen
  1 sibling, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2018-03-01  7:14 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: Xen, LVM general discussion and development

Il 28-02-2018 22:43 Zdenek Kabelac ha scritto:
> By default - a full pool starts to 'error' all 'writes' after 60 seconds.

Based on what I remember, and what you wrote below, I think "all writes" 
in the context above means "writes to unallocated areas", right? Because 
even full pool can write to already-provisioned areas.

> The main problem is - after reboot - this 'missing/unprovisioned'
> space may provide some old data...

Can you elaborate on this point? Are you referring to current behavior 
or to an hypothetical "full read-only" mode?

> It still depends - there is always some sort of 'race' - unless you
> are willing to 'give-up' too early to be always sure, considering
> there are technologies that may write many GB/s...

Sure - this was the "more-or-less" part in my sentence.

> You can use rootfs with thinp - it's very fast for testing i.e. 
> upgrades
> and quickly revert back - just there should be enough free space.

For testing, sure. However for a production machine I would rarely use 
root on thinp. Maybe my reasoning is skewed by the fact that I mostly 
work with virtual machines, so test/heavy upgrades are *not* done on the 
host itself, rather on the guest VM.

> 
> Depends on version of kernel and filesystem in use.
> 
> Note RHEL/Centos kernel has lots of backport even when it's look quite 
> old.

Sure, and this is one of the key reason why I use RHEL/CentOS rather 
than Debian/Ubuntu.

> Backups primarily sits on completely different storage.
> 
> If you keep backup of data in same pool:
> 
> 1.)
> error on this in single chunk shared by all your backup + origin -
> means it's total data loss - especially in case where filesystem are
> using 'BTrees' and some 'root node' is lost - can easily render you
> origin + all backups completely useless.
> 
> 2.)
> problems in thin-pool metadata can make all your origin+backups just
> an unordered mess of chunks.

True, but this does not disprove the main point: snapshots are an invaluable 
tool in building your backup strategy. Obviously, if the thin-pool meta 
volume has a problem, then all volumes (snapshot or not) become invalid. 
Do you have any recovery strategy in this case? For example, the root 
ZFS uberblock is written at *both* the device start and end. Does something 
similar exist for thinp?

> 
> There are also some on going ideas/projects - one of them was to have
> thinLVs with priority to be always fully provisioned - so such thinLV
> could never be the one to have unprovisioned chunks....
> Other was a better integration of filesystem with 'provisioned' 
> volumes.

Interesting. Can you provide some more information on these projects?
Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01  7:14                               ` Gionatan Danti
@ 2018-03-01  8:31                                 ` Zdenek Kabelac
  2018-03-01  9:43                                   ` Gianluca Cecchi
  2018-03-01  9:52                                   ` Gionatan Danti
  0 siblings, 2 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-01  8:31 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: Xen, LVM general discussion and development

On 1.3.2018 at 08:14, Gionatan Danti wrote:
> Il 28-02-2018 22:43 Zdenek Kabelac ha scritto:
>> By default - a full pool starts to 'error' all 'writes' after 60 seconds.
> 
> Based on what I remember, and what you wrote below, I think "all writes" in 
> the context above means "writes to unallocated areas", right? Because even 
> full pool can write to already-provisioned areas.

yes

> 
>> The main problem is - after reboot - this 'missing/unprovisioned'
>> space may provide some old data...
> 
> Can you elaborate on this point? Are you referring to current behavior or to 
> an hypothetical "full read-only" mode?

If the tool wanted to write 1 sector to a 256K chunk that needed provisioning,
and provisioning was not possible - after reboot - you will still see
the 'old' content.

In the case of a filesystem that does not stop upon the 1st failing write, you can 
then see a potential problem since the fs could issue writes - where half of them
were possibly written and the other half errored - then you reboot,
and that 'errored' half is actually returning 'some old data', and this can make 
the filesystem seriously confused...
Fortunately both ext4 & xfs now have correct logic here for journaling,
although IMHO it is still not optimal.

> True, but this does not disprove the main point: snapshots are an invaluable tool in 
> building your backup strategy. Obviously, if the thin-pool meta volume has a 
> problem, then all volumes (snapshot or not) become invalid. Do you have any 
> recovery strategy in this case? For example, the root ZFS uberblock is written 
> at *both* the device start and end. Does something similar exist for thinp?

Unfortunately losing root blocks of thin-pool metadata is a big problem.
That's why metadata should rather be on some resilient, fast storage.
The logic of writing should not let data get corrupted (modulo a broken kernel).

But yes - there is quite some room for improvement in thin_repair tool....

>> There are also some on going ideas/projects - one of them was to have
>> thinLVs with priority to be always fully provisioned - so such thinLV
>> could never be the one to have unprovisioned chunks....
>> Other was a better integration of filesystem with 'provisioned' volumes.
> 
> Interesting. Can you provide some more information on these projects?

Likely watching Joe's pages (the main thin-pool creator) and whatever the XFS group 
is working on....

Also note - we are going to integrate VDO support - which will be a 2nd. way 
for thin-provisioning with different set of features - missing snapshots, but 
having compression & deduplication....

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01  8:31                                 ` Zdenek Kabelac
@ 2018-03-01  9:43                                   ` Gianluca Cecchi
  2018-03-01 11:10                                     ` Zdenek Kabelac
  2018-03-01  9:52                                   ` Gionatan Danti
  1 sibling, 1 reply; 94+ messages in thread
From: Gianluca Cecchi @ 2018-03-01  9:43 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen, Gionatan Danti

[-- Attachment #1: Type: text/plain, Size: 743 bytes --]

On Thu, Mar 1, 2018 at 9:31 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote:

>
>
> Also note - we are going to integrate VDO support - which will be a 2nd.
> way for thin-provisioning with different set of features - missing
> snapshots, but having compression & deduplication....
>
> Regards
>
> Zdenek
>
>
Interesting.
I would have expected to find it already upstream, eg inside Fedora 27, to
begin to try it, but it seems it is not there.
I found this for the upcoming RHEL 7.5:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/vdo

but nothing in updates-testing for f27 either.
Is the one below the only and correct source to test on Fedora:
https://github.com/dm-vdo/vdo
?

Thanks,
Gianluca

[-- Attachment #2: Type: text/html, Size: 1508 bytes --]

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01  8:31                                 ` Zdenek Kabelac
  2018-03-01  9:43                                   ` Gianluca Cecchi
@ 2018-03-01  9:52                                   ` Gionatan Danti
  2018-03-01 11:23                                     ` Zdenek Kabelac
  1 sibling, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2018-03-01  9:52 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: Xen, LVM general discussion and development

On 01/03/2018 09:31, Zdenek Kabelac wrote:
> If the tool wanted to write 1 sector to a 256K chunk that needed 
> provisioning,
> and provisioning was not possible - after reboot - you will still see
> the 'old' content.
>
> In the case of a filesystem that does not stop upon the 1st failing write, 
> you can then see a potential problem since the fs could issue writes - where 
> half of them
> were possibly written and the other half errored - then you reboot,
> and that 'errored' half is actually returning 'some old data', and this 
> can make the filesystem seriously confused...
> Fortunately both ext4 & xfs now have correct logic here for 
> journaling,
> although IMHO it is still not optimal.

Ah ok, we are speaking about current "can write to allocated chunks only 
when full" behavior. This is why I would greatly appreciate a "total 
read only mode" on full pool.

Any insight on what ext4 and xfs changed to mitigate the problem? Even a 
mailing list link would be very useful ;)

> Unfortunately losing root blocks on thin-pool metadata is a big problem.
> That's why metadata should be rather on some resilient fast storage.
> Logic of writing should not let data corrupt (% broken kernel).
> 
> But yes - there is quite some room for improvement in thin_repair tool....

In the past, I fiddled with thin_dump to create backups of the metadata 
device. Do you think it is a good idea? What somewhat scares me is that, 
for thin_dump to work, the metadata device should be manually put in 
"snapshot" mode and, after the dump, it has to be unfrozen. What will 
happen if I forget to unfreeze it?
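
For reference, the sequence meant here is roughly the following (a sketch -
the device names are placeholders, thin_dump comes from
thin-provisioning-tools):

  # put the pool metadata into "snapshot" mode
  dmsetup message vg-pool-tpool 0 reserve_metadata_snap

  # dump the metadata from that snapshot while the pool stays in use
  thin_dump --metadata-snap /dev/mapper/vg-pool_tmeta > /root/pool-tmeta.xml

  # release the metadata snapshot again afterwards
  dmsetup message vg-pool-tpool 0 release_metadata_snap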

> Likely watching Joe's pages (main thin-pool creator) and whatever XFS 
> groups is working on....

Again, do you have any links for quick sharing?

> Also note - we are going to integrate VDO support - which will be a 2nd. 
> way for thin-provisioning with different set of features - missing 
> snapshots, but having compression & deduplication....

I thought compression, deduplication, send/receive, etc. were worked on in 
the framework of Stratis. What do you mean by "VDO support"?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01  9:43                                   ` Gianluca Cecchi
@ 2018-03-01 11:10                                     ` Zdenek Kabelac
  0 siblings, 0 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-01 11:10 UTC (permalink / raw)
  To: linux-lvm

On 1.3.2018 at 10:43, Gianluca Cecchi wrote:
> On Thu, Mar 1, 2018 at 9:31 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote:
> 
> 
> 
>     Also note - we are going to integrate VDO support - which will be a 2nd.
>     way for thin-provisioning with different set of features - missing
>     snapshots, but having compression & deduplication....
> 
>     Regards
> 
>     Zdenek
> 
> 
> Interesting.
> I would have expected to find it already upstream, eg inside Fedora 27 to 
> begin to try, but it seems not here.
> I found this for upcoming RH EL 7.5:
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/vdo
> 
> but nothing neither in updates-testing for f27
> Is this one below the only and correct source to test on Fedora:
> https://github.com/dm-vdo/vdo
> ?
>

There is a COPR repository ATM available for certain f27 kernels.

For a regular Fedora component the VDO target needs to go into the upstream kernel 
first - but this needs some code changes in the module - so stay tuned....

Note - current model is 'standalone' usage of VDO devices - while we do plan 
to integrate support for VDO as another segtype.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01  9:52                                   ` Gionatan Danti
@ 2018-03-01 11:23                                     ` Zdenek Kabelac
  2018-03-01 12:48                                       ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-01 11:23 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development

On 1.3.2018 at 10:52, Gionatan Danti wrote:
> On 01/03/2018 09:31, Zdenek Kabelac wrote:
>> If the tool wanted to write 1 sector to a 256K chunk that needed provisioning,
>> and provisioning was not possible - after reboot - you will still see
>> the 'old' content.
>>
>> In the case of a filesystem that does not stop upon the 1st failing write, you then 
>> can see a potential problem since the fs could issue writes - where half of them
>> were possibly written and the other half errored - then you reboot,
>> and that 'errored' half is actually returning 'some old data', and this can 
>> make the filesystem seriously confused...
>> Fortunately both ext4 & xfs now have correct logic here for journaling,
>> although IMHO it is still not optimal.
> 
> Ah ok, we are speaking about current "can write to allocated chunks only when 
> full" behavior. This is why I would greatly appreciate a "total read only 
> mode" on full pool.
> 
> Any insight on what ext4 and xfs changed to mitigate the problem? Even a 
> mailing list link would be very useful ;)

In general - for extX it's remount read-only upon error - which works for 
journaled metadata - if you want the same protection for 'data' you need to switch 
to the rather expensive data journaling mode.

For XFS there is now similar logic where a write error on the journal stops 
filesystem usage - look for some older message (even here in this list) - it's 
been mentioned a few times already I guess...
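
For ext4 this maps to the usual mount options / tune2fs error behaviour (a
sketch - the device and mount point are placeholders):

  # metadata protection: flip to read-only on the first error
  mount -o errors=remount-ro /dev/vg/thinlv /mnt/data
  # or make it the persistent default for that filesystem
  tune2fs -e remount-ro /dev/vg/thinlv

  # the same protection for data needs full (expensive) data journaling
  mount -o errors=remount-ro,data=journal /dev/vg/thinlv /mnt/data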

>> Unfortunately losing root blocks on thin-pool metadata is a big problem.
>> That's why metadata should be rather on some resilient fast storage.
>> Logic of writing should not let data corrupt (% broken kernel).
>>
>> But yes - there is quite some room for improvement in thin_repair tool....
> 
> In the past, I fiddled with thin_dump to create backups of the metadata 
> device. Do you think it is a good idea? What somewhat scares me is that, for 

Depends on the use-case - if you take snapshots of your thin volume, this likely 
will not help you with recovery at all.

If your thin-volumes are rather standalone, only occasionally modified, 
'growing' fs images (so no trimming ;)) - then with this metadata backup 
there can be some small chance you would be able to obtain some 'usable' 
mappings of chunks to the block device layout...

Personally I'd not recommend using this at all unless you know the rather 
low-level details of how this whole thing works....

> thin_dump to work, the metadata device should be manually put in "snapshot" 
> mode and, after the dump, it has to be unfrozen. What will happen if I forget 
> to unfreeze it?

A filesystem that is never unfrozen is simply not usable...

>> Likely watching Joe's pages (main thin-pool creator) and whatever XFS groups 
>> is working on....
> 
> Again, do you have any links for quick sharing?

https://github.com/jthornber

>> Also note - we are going to integrate VDO support - which will be a 2nd. way 
>> for thin-provisioning with different set of features - missing snapshots, 
>> but having compression & deduplication....
> 
> I thought compression, deduplication, send/receive, etc. were worked on in the 
> framework of Stratis. What do you mean by "VDO support"?

Clearly Stratis is not a topic for lvm2 at all ;) that's all I'm going to say 
about this....

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01 11:23                                     ` Zdenek Kabelac
@ 2018-03-01 12:48                                       ` Gionatan Danti
  2018-03-01 16:00                                         ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2018-03-01 12:48 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development


On 01/03/2018 12:23, Zdenek Kabelac wrote:
> In general - for extX it's remount read-only upon error - which works 
> for journaled metadata - if you want the same protection for 'data' you need 
> to switch to the rather expensive data journaling mode.
> 
> For XFS there is now similar logic where a write error on the journal stops 
> filesystem usage - look for some older message (even here in this list) - 
> it's been mentioned a few times already I guess...

Yes, we discussed the issue here. If I recall correctly, the XFS journal is 
a circular buffer which will always be written to already-allocated 
chunks. From my tests (June 2017) it was clear that failing async 
writes, even with errorwhenfull=y, did not always trigger a prompt XFS 
stop (but the filesystem eventually shut down after some more 
writes/minutes).

> Depends on the use-case - if you take snapshots of your thin volume, this 
> likely will not help you with recovery at all.
> 
> If your thin-volumes are rather standalone, only occasionally modified, 
> 'growing' fs images (so no trimming ;)) - then with this metadata 
> backup there can be some small chance you would be able to obtain some 
> 'usable' mappings of chunks to the block device layout...
> 
> Personally I'd not recommend using this at all unless you know the rather 
> low-level details of how this whole thing works....

Ok, I realized that and stopped using it for anything but testing.

> A filesystem that is never unfrozen is simply not usable...

I was speaking about a thin metadata snapshot that is never released - ie: 
reserve_metadata_snap *without* a corresponding release_metadata_snap. 
Will that cause problems?


> Clearly Stratis is not a topic for lvm2 at all ;) that's all I'm going 
> to say about this....

OK :p

I think VDO is a fruit of the Permabit acquisition, right? As it implements 
its own thin provisioning, will thinlvm migrate to VDO or will it 
continue to use the current dmtarget?


-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01 12:48                                       ` Gionatan Danti
@ 2018-03-01 16:00                                         ` Zdenek Kabelac
  2018-03-01 16:26                                           ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-01 16:00 UTC (permalink / raw)
  To: linux-lvm, Gionatan Danti

On 1.3.2018 at 13:48, Gionatan Danti wrote:
> 
> On 01/03/2018 12:23, Zdenek Kabelac wrote:
>> In general - for extX it's remount read-only upon error - which works for 
>> journaled metadata - if you want the same protection for 'data' you need to 
>> switch to the rather expensive data journaling mode.
>>
>> For XFS there is now similar logic where a write error on the journal stops 
>> filesystem usage - look for some older message (even here in this list) - it's 
>> been mentioned a few times already I guess...
> 

There is quite 'detailed' config for XFS - just not all settings
are probably tuned in the best way for provisioning.

See:


https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/xfs-error-behavior
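
The knobs live in sysfs per mounted device (a sketch - "dm-3" is a
placeholder and the exact set of files depends on the kernel version):

  # how often XFS retries failing metadata writes on ENOSPC before shutting
  # the filesystem down (0 = fail immediately, -1 = retry forever)
  echo 0 > /sys/fs/xfs/dm-3/error/metadata/ENOSPC/max_retries

  # give up on pending retries when the filesystem is unmounted
  echo 1 > /sys/fs/xfs/dm-3/error/fail_at_unmount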


> 
>> A filesystem that is never unfrozen is simply not usable...
> 
> I was speaking about a thin metadata snapshot that is never released - ie: 
> reserve_metadata_snap *without* a corresponding release_metadata_snap. Will 
> that cause problems?
> 

The metadata snapshot 'just consumes' thin-pool metadata space;
at any time there can be only 1 such snapshot - so before the next usage
you have to drop the existing one.

So IMHO it should have no other effects unless you hit some bugs...

> I think VDO is a fruit of the Permabit acquisition, right? As it implements its 
> own thin provisioning, will thinlvm migrate to VDO or will it continue to use 
> the current dmtarget?


The thin-pool target has different goals than VDO,
so both targets will likely live together.

Possibly thin-pool might be tested for using a VDO data volume if it makes any 
sense...

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-01 16:00                                         ` Zdenek Kabelac
@ 2018-03-01 16:26                                           ` Gionatan Danti
  0 siblings, 0 replies; 94+ messages in thread
From: Gionatan Danti @ 2018-03-01 16:26 UTC (permalink / raw)
  To: Zdenek Kabelac, linux-lvm

On 01/03/2018 17:00, Zdenek Kabelac wrote:
> metadata snapshot 'just consumes' thin-pool metadata space,
> at any time there can be only 1 snapshot - so before next usage
> you have to drop the existing one.
> 
> So IMHO it should have no other effects unless you hit some bugs...

Mmm... does it mean that a non-released metadata snapshot will lead to 
increased metadata volume usage (possibly filling it faster)?

> The thin-pool target has different goals than VDO,
> so both targets will likely live together.
> 
> Possibly thin-pool might be tested for using a VDO data volume if it makes 
> any sense...

Great. Thank you for the very informative discussion.
Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-02-28  9:26                         ` Zdenek Kabelac
  2018-02-28 19:07                           ` Gionatan Danti
@ 2018-03-03 17:52                           ` Xen
  2018-03-04 23:27                             ` Zdenek Kabelac
  1 sibling, 1 reply; 94+ messages in thread
From: Xen @ 2018-03-03 17:52 UTC (permalink / raw)
  To: linux-lvm

I did not rewrite this entire message, please excuse the parts where I 
am a little more "on the attack".



Zdenek Kabelac wrote on 28-02-2018 10:26:

> I'll probably repeat myself again, but thin provisioning can't be
> responsible for all kernel failures. There is no way the DM team can fix
> all the related paths on this road.

Are you saying there are kernel bugs presently?

> If you don't plan to help resolve those issues - there is no point
> in complaining over and over again - we are already well aware of these
> issues...

I'm not aware of any issues, what are they?

I was responding here to an earlier thread I couldn't respond to back 
then, the topic was whether it was possible to limit thin snapshot 
sizes, you said it wasn't, I was just recapping this thread.

> If the admin can't stand failing system, he can't use thin-p.

That just sounds like a blanket excuse for any kind of failure.

> Overprovisioning on the DEVICE level simply IS NOT equivalent to a full
> filesystem like you would like to see all the time here, and it has
> already been explained to you many times that filesystems are simply not
> ready for this - fixes are ongoing but they will take time and it's
> really pointless to exercise this on 2-3 year old kernels...

Pardon me, but your position has typically been that it is fundamentally 
impossible, not that "we're not there yet".

My questions have always been about fundamental possibilities, to which 
you always answer in the negative.

If something is fundamentally impossible, don't be surprised if you then 
don't get any help in getting there: you always close off all paths 
leading towards it.

You shut off any interest, any discussion, and any development interest 
in paths that a long time later, you then say "we're working on it" 
whereas before you always said "it's impossible".

This happened before where first you say "It's not a problem, it's admin 
error" and then a year later you say "Oh yeah, it's fixed now".

Which is it?

My interest has always been, at least philosophically, or concerning 
principle abilities, in development and design, but you shut it off 
saying it's impossible.

Now you complain you are not getting any help.

> Thin provisioning has it's use case and it expects admin is well aware
> of possible problems.

That's a blanket statement once more that says nothing about actual 
possibilities or impossibilities.

> If you are aiming for a magic box working always right - stay away
> from thin-p - the best advice....

Another blanket statement excusing any and all mistakes or errors or 
failures the system could ever have.

> Do NOT take thin snapshot of your root filesystem so you will avoid
> thin-pool overprovisioning problem.

Zdenek, could you please make up your mind?

You brought up thin snapshotting as a reason for putting root on thin, 
as a way of saying that thin failure would lead to system failure and 
not just application failure,

whereas I maintained that application failure was acceptable.

I tried to make the distinction between application level failure (due 
to filesystem errors) and system instability caused by thin.

You then tried to make those equivalent by saying that you can also put 
root on thin, in which case application failure becomes system failure.

I never wanted root on thin, so don't tell me not to snapshot it, that 
was your idea.


> Rule #1:
> 
> Thin-pool was never targeted for 'regular' usage of full thin-pool.

All you are asked is to design for error conditions.

You want only to take care of the special use case where nothing bad 
happens.

Why not just take care of the general use case where bad things can 
happen?

You know, real life?

In any development process you first don't take care of all error 
conditions, you just can't be bothered with them yet. Eventually, you 
do.

It seems you are trying to avoid having to deal with the glaring error 
conditions that have always existed, but you are trying to avoid having 
to take any responsibility for it by saying that it was not part of the 
design.

To make this more clear Zdenek, your implementation does not cater to 
the general use case of thin provisioning, but only to the special use 
case where full thin pools never happen.

That's a glaring omission in any design. You can go on and on on how 
thin-p was not "targetted" at that "use case", but that's like saying 
you built a car engine that was not "targetted" at "running out of 
fuel".

Then when the engine breaks down you say it's the user's fault.

Maybe retarget your design?

Running out of fuel is not a use case.

It's a failure condition that you have to design for.

> Full thin-pool is serious ERROR condition with bad/ill effects on 
> systems.

Yes and your job as a systems designer is to design for those error 
conditions and make sure they are handled gracefully.

You just default on your responsibility there.

The reason you brought up root on thin was to elevate application 
failure to the level of system failure so as to make them equivalent and 
then to say that you can't do anything about system failure.

This is a false analogy, we only care about application failure in the 
general use case of stuff that is allowed to happen, and we distinguish 
system failure which is not allowed to happen.

Yes Zdenek, system failure is your responsibility as the designer, it's 
not the admin's job except when he has a car that breaks down when the 
fuel runs out.

But that, clearly, would be seen as a failure on behalf of the one 
designing the engine.

You are responding so defensively when I have barely said anything that 
it is clear you feel extremely guilty about this.


> Thin-pool was designed to 'delay/postpone' real space usage - aka you
> can use more 'virtual' space with the promise you deliver real storage
> later.

That doesn't cover the full spectrum of what we consider to be "thin 
provisioning".

You only designed for a very special use case in which the fuel never 
runs out.

The aeroplane that doesn't need landing gear because it was designed to 
never run out of fuel.

In the kingdom of birds, only swallows do that. Most other birds are 
designed for landing and taking off too.

You built a system that only works if certain conditions are met.

I'm just saying you could expand your design and cover the error 
conditions as well.

So yes: I hear you, you didn't design for the error condition.

That's all I've been saying.

> So if you have different goals - like having some kind of full
> equivalency logic to full filesystem - you need to write different
> target....

Maybe I could but I still question why it was not designed into thin-p, 
and I also doubt that it couldn't be redesigned into it.

I mean I doubt that it would require a huge rewrite, I think that if I 
were to do that thing, I could start off with thin-p just fine.

Certainly, interesting, and a worthy goal.

There are a billion fun projects I would like to take on, but generally 
I am just inquiring, sometimes I am angry about stuff not working,

but when I talk about "could this be possible?" you can assume I am 
talking from a development perspective, and you don't have to constantly 
only defend the thing currently not existing.

Sometimes I am just asking about possibilities.

"Yes, it doesn't exist, and it would require that and that and that" 
would be a sufficient answer.

I don't always need to hear all of the excuses as to why it isn't so, 
sometimes I just wonder how it could be done.



>> I simply cannot reconcile an attitude that thin-full-risk is 
>> acceptable and the admin's job while at the same time advocating it 
>> for root filesystems.
> 
> Do NOT use thin-provinioning - as it's not meeting your requirements.

It was your suggestion to use thin for root as a way of artificially 
increasing those requirements and then saying that they can't be met.

> Big news -  we are at ~4.16 kernel upstream - so noone is really
> taking much care about 4.4 troubles here - sorry about that....

I said back then.

You don't really listen, do you...

> Speaking of 4.4 - I'd generally advice to jump to higher versions of
> kernel ASAP - since  4.4 has some known bad behavior in the case
> thin-pool 'metadata' get overfilled.

I never said that I was using 4.4, if you took care to read you would 
see that I was speaking about the past.

Xenial is at 4.13 right now.

> There is on going 'BOOM' project - check it out please....

Okay...

> There is not much point in commenting support for some old distros
> other then you really should try harder with your distro
> maintainers....

I was just explaining why I was experiencing hangs and you didn't know 
what I was talking about, causing some slight confusion in our threads.

>> That's a lot easier if your root filesystem doesn't lock up.
> 
> - this is not really a fault of dm thin-provisioning kernel part.

I was saying, Zdenek, that your suggestion to use root on thin was 
rather unwise.

I don't know what you're defending against, I never said anything other 
than that.

> - on going fixes to file systems are being pushed upstream (for years).
> - fixes will not appear in years old kernels as such patches are
> usually invasive so unless you use pay someone to do the backporting
> job the easiest way forward is to user newer improved kernel..

I understand that you are mixing up my system hangs with the above 
problems you would have by using root on full thin pool, I have already 
accepted that the system hangs are fixed in later kernels.

> ATM  thin-pool can't deliver equivalent logic - just like old-snaps
> can't deliver thin-pool logic.

Sure, but my question was never about "ATM".

I asked about potential, not status quo.

Please, if you keep responding to development inquiries with status quo 
answers, you will never find any help in getting there.

The "what is" and the "what is to be" don't have to be the same, but you 
are always responding defensively as to the "what is", not understanding 
the questions.

Those system hangs, sure, status quo. Those snapshots? Development 
interest.

>> However, I don't have the space for a full copy of every filesystem, 
>> so if I snapshot, I will automatically overprovision.
> 
> Back to rule #1 - thin-p is about 'delaying' deliverance of real space.
> If you already have plan to never deliver promised space - you need to
> live with consequences....

Like I said, I was just INQUIRING about the possibility of limiting the 
size of a thin snapshot.

The fact that you respond so defensively with respect to thin pools 
overflowing, means you feel and are guilty about not taking care of that 
situation.

I was inquiring about a way to prevent thin pool overflow.

If you then suggest that the only valid use case is to have some 
auto-expanding pool, then either you are not content with just giving 
the answer to that question, or you feel it's your fault that something 
isn't possible and you try to avoid that by putting the blame on the 
user for "using it in a wrong way".

I asked a technical question. You respond like a guy who is asked why he 
didn't clean the bathroom.

According to schedule.

Easy now, I just asked whether it was possible or not.

I didn't ask you to explain why it hasn't been done.

Or where to put the blame for that.

I would say you feel rather guilty and to every insinuation that there 
is a missing feature you respond with great noise as to why the feature 
isn't actually missing.

So if I say "Is this possible?" you respond with "YOU ARE USING IT THE 
WRONG WAY" as if to feel rather uneasy to say that something isn't 
possible.

Which again, leads, of course, to bad design.

Your uneasiness Zdenek is the biggest signpost here.

Sorry to be so liberal here.


>> My snapshots are indeed meant for backups (of data volumes) ---- not 
>> for rollback ----- and for rollback ----- but only for the root 
>> filesystem.
> 
> There is more fundamental problem here:
> 
> !SNAPSHOTS ARE NOT BACKUPS!

Can you please stop screaming?

Do I have to spell out that I use the snapshot to make the backup and 
then discard the snapshot?

> This is the key problem with your thinking here (unfortunately you are
> not 'alone' with this thinking)

Yeah, maybe you shouldn't jump to conclusions and learn to read better.

>> My problem was system hangs, but my question was about limiting 
>> snapshot size on thin.
> 
> Well your problem primarily is usage of too old system....

I said "was", learn to read, Zdenek.

> Sorry to say this - but if you insist to stick with old system

Where did I say that? I said that back then, I had an 4.4 system that 
experienced these issues.

> - ask
> your distro maintainers to do all the backporting work for you - this
> is nothing lvm2 can help with...

I explained to you that our confusion back then was due to my using the 
then-current release of Ubuntu Xenial which had these problems.

I was just responding to an old thread with these conclusions:

1) Our confusion with respect to those "system hangs" was due to the 
fact that you didn't know what I was talking about, thus I thought you 
were excusing them, when you weren't.

2) My only inquiry had been about preventing snapshot overflow.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-02-28 19:07                           ` Gionatan Danti
  2018-02-28 21:43                             ` Zdenek Kabelac
@ 2018-03-03 18:17                             ` Xen
  2018-03-04 20:53                               ` Zdenek Kabelac
  1 sibling, 1 reply; 94+ messages in thread
From: Xen @ 2018-03-03 18:17 UTC (permalink / raw)
  To: linux-lvm

Gionatan Danti wrote on 28-02-2018 20:07:

> To recap (Zdenek, correct me if I am wrong): the main problem is
> that, on a full pool, async writes will more-or-less silently fail
> (with errors shown in dmesg, but nothing more).

Yes I know you were writing about that in the later emails.

> Another possible source
> of problems is that, even on a full pool, *some* writes will complete
> correctly (the ones to already-allocated chunks).

Idem.

> In the past it was argued that putting the entire pool in read-only mode
> (where *all* writes fail, but reads are permitted to complete) would be
> a better fail-safe mechanism; however, it was stated that no current
> dmtarget permits that.

Right. Don't forget my main problem was system hangs due to older 
kernels, not the stuff you write about now.

> Two (good) solutions were given, both relying on scripting (see
> the "thin_command" option in lvm.conf):
> - fsfreeze on a nearly full pool (ie: >=98%);
> - replace the dmthinp target with the error target (using dmsetup).
> 
> I really think that with the good scripting infrastructure currently
> built in lvm this is a more-or-less solved problem.

I agree in practical terms. Doesn't make for good target design, but 
it's good enough, I guess.

>> Do NOT take thin snapshot of your root filesystem so you will avoid
>> thin-pool overprovisioning problem.
> 
> But is someone *really* pushing thinp for the root filesystem? I always
> used it for data partitions only... Sure, rollback capability on root
> is nice, but it is the data which is *really* important.

No, Zdenek thought my system hangs resulted from something else and then 
in order to defend against that (being the fault of current DM design) 
he tried to raise the ante by claiming that root-on-thin would cause 
system failure anyway with a full pool.

I never suggested root on thin.

> In stress testing, I never saw a system crash on a full thin pool

That's good to know, I was just using Jessie and Xenial.

> We discussed that in the past also, but as snapshot volumes really are
> *regular*, writable volumes (with a 'k' flag to skip activation by
> default), the LVM team takes the "safe" stance of not automatically
> dropping any volume.

Sure I guess any application logic would have to be programmed outside 
of any (device mapper module) anyway.

> The solution is to use scripting/thin_command with lvm tags. For 
> example:
> - tag all snapshot with a "snap" tag;
> - when usage is dangerously high, drop all volumes with "snap" tag.

Yes, now I remember.

I was envisioning some other tag that would allow a quotum to be set for 
every volume (for example as a %) and the script would then drop the 
volumes with the larger quotas first (thus the larger snapshots) so as 
to protect smaller volumes, which are probably more important, and you can 
save more of them. I am ashamed to admit I had forgotten about that 
completely ;-).

>> Back to rule #1 - thin-p is about 'delaying' deliverance of real 
>> space.
>> If you already have plan to never deliver promised space - you need to
>> live with consequences....
> 
> I am not sure to 100% agree on that.

When Zdenek says "thin-p" he might mean "thin-pool" but not generally 
"thin-provisioning".

I mean to say that the very special use case of an always auto-expanding 
system is a special use case of thin provisioning in general.

And I would agree, of course, that the other uses are also legit.

> Thinp is not only about
> "delaying" space provisioning; it clearly is also (mostly?) about
> fast, modern, usable snapshots. Docker, snapper, stratis, etc. all use
> thinp mainly for its fast, efficient snapshot capability.

Thank you for bringing that in.

> Denying that
> is not so useful and led to "overwarning" (ie: when snapshotting a
> volume on a virtually-fillable thin pool).

Aye.

>> !SNAPSHOTS ARE NOT BACKUPS!
> 
> Snapshots are not backups, as they do not protect from hardware
> problems (and denying that would be lame)

I was really saying that I was using them to run backups off of.

> however, they are an
> invaluable *part* of a successful backup strategy. Having multiple
> rollback targets, even on the same machine, is a very useful tool.

Even more you can backup running systems, but I thought that would be 
obvious.

> Again, I don't understand why we are speaking about system crashes. With
> root *not* using thinp, I never saw a system crash due to a full data
> pool.

I had it on 3.18 and 4.4, that's all.

> Oh, and I use thinp on RHEL/CentOS only (Debian/Ubuntu backports are
> way too limited).

That could be it too.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-02-28 21:43                             ` Zdenek Kabelac
  2018-03-01  7:14                               ` Gionatan Danti
@ 2018-03-03 18:32                               ` Xen
  2018-03-04 20:34                                 ` Zdenek Kabelac
  1 sibling, 1 reply; 94+ messages in thread
From: Xen @ 2018-03-03 18:32 UTC (permalink / raw)
  To: linux-lvm

Zdenek Kabelac wrote on 28-02-2018 22:43:

> It still depends - there is always some sort of 'race' - unless you
> are willing to 'give-up' too early to be always sure, considering
> there are technologies that may write many GB/s...

That's why I think it is only possible for snapshots.

> You can use rootfs with thinp - it's very fast for testing i.e. 
> upgrades
> and quickly revert back - just there should be enough free space.

That's also possible with non-thin.

> Snapshot are using space - with hope that if you will 'really' need 
> that space
> you either add this space to you system - or you drop snapshots.

And I was saying back then that it would be quite easy to have a script 
that would drop bigger snapshots first (of larger volumes), given that 
those are most likely less important and dropping them is more likely to 
prevent thin pool fill-up, and you can save more smaller snapshots this way.

So basically I mean this gives your snapshots a "quotum" that I was 
asking about.

Lol now I remember.

You could easily give (by script) every snapshot a quotum of 20% of full 
volume size, then when 90% thin target is reached, you start dropping 
volumes with the largest quotum first, or something.

Idk, something more meaningful than that, but you get the idea.

You can calculate the "own" blocks of the snapshot and when the pool is 
full you check for snapshots that have surpassed their quotum, and the 
ones that are past their quotas in the largest numbers you drop first.
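
Something along these lines can already be scripted with the reporting
fields lvm exposes (a rough sketch - it approximates the "own" blocks with
the snapshot's allocated size; thin_ls from thin-provisioning-tools could
give exact exclusive-block counts; pool name, tag and the 90% threshold are
placeholders):

  POOL="vg/pool"
  # list "snap"-tagged volumes, biggest allocated space (MB) first, and drop
  # them one by one until the pool is back under the threshold
  lvs --noheadings --units m --nosuffix -o lv_full_name,lv_size,data_percent @snap |
  awk '{ printf "%f %s\n", $2 * $3 / 100, $1 }' | sort -rn | cut -d' ' -f2- |
  while read lv; do
      used=$(lvs --noheadings -o data_percent "$POOL" | tr -d ' ' | cut -d. -f1)
      [ "$used" -lt 90 ] && break
      lvremove -fy "$lv"
  done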

> But as said - with today's 'rush' of development and load of updates -
> users do want to try a 'new distro upgrade' - if it works - all is fine -
> if it doesn't, let's have a quick road back - so using a thin volume for
> rootfs is a pretty wanted case.

But again, regular snapshot of sufficient size does the same thing, you 
just have to allocate for it in advance, but for root this is not really 
a problem.

Then no more issue with thin-full problem.

I agree, less convenient, and a slight bit slower, but not by much for 
this special use case.

> There are also some on going ideas/projects - one of them was to have
> thinLVs with priority to be always fully provisioned - so such thinLV
> could never be the one to have unprovisioned chunks....

That's what ZFS does... ;-).

> Other was a better integration of filesystem with 'provisioned' 
> volumes.

That's what I was talking about back then...............

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-03 18:32                               ` Xen
@ 2018-03-04 20:34                                 ` Zdenek Kabelac
  0 siblings, 0 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-04 20:34 UTC (permalink / raw)
  To: LVM general discussion and development, Xen

On 3.3.2018 at 19:32, Xen wrote:
> Zdenek Kabelac wrote on 28-02-2018 22:43:
> 
>> It still depends - there is always some sort of 'race' - unless you
>> are willing to 'give-up' too early to be always sure, considering
>> there are technologies that may write many GB/s...
> 
> That's why I think it is only possible for snapshots.
> 
>> You can use rootfs with thinp - it's very fast for testing i.e. upgrades
>> and quickly revert back - just there should be enough free space.
> 
> That's also possible with non-thin.
> 
>> Snapshot are using space - with hope that if you will 'really' need that space
>> you either add this space to you system - or you drop snapshots.
> 
> And I was saying back then that it would be quite easy to have a script that 
> would drop bigger snapshots first (of larger volumes) given that those are 
> most likely less important and more likely to prevent thin pool fillup, and 
> you can save more smaller snapshots this way.
> 
> So basically I mean this gives your snapshots a "quotum" that I was asking about.
> 
> Lol now I remember.
> 
> You could easily give (by script) every snapshot a quotum of 20% of full 
> volume size, then when 90% thin target is reached, you start dropping volumes 
> with the largest quotum first, or something.
> 
> Idk, something more meaningful than that, but you get the idea.
> 
> You can calculate the "own" blocks of the snapshot and when the pool is full 
> you check for snapshots that have surpassed their quotum, and the ones that 
> are past their quotas in the largest numbers you drop first.

I hope it's finally getting through to you that all your wishes CAN be implemented.
It's up to you to decide what kind of reaction you want and when it shall happen.

It's really only up to 'you' to use all the available tooling to build your own 
'dreamed-of' setup - lvm2 & the kernel target provide the tooling.

If you however hope lvm2 will ship a 'script' perfectly tuned for Xen's system,
it's up to you to write and send a patch...

> 
>> But as said - with today's 'rush' of development and load of updates -
>> users do want to try a 'new distro upgrade' - if it works - all is fine -
>> if it doesn't, let's have a quick road back - so using a thin volume for
>> rootfs is a pretty wanted case.
> 
> But again, regular snapshot of sufficient size does the same thing, you just 
> have to allocate for it in advance, but for root this is not really a problem.
> 
> Then no more issue with thin-full problem.
> 
> I agree, less convenient, and a slight bit slower, but not by much for this 
> special use case.

I've no idea what you mean by this...

>> There are also some on going ideas/projects - one of them was to have
>> thinLVs with priority to be always fully provisioned - so such thinLV
>> could never be the one to have unprovisioned chunks....
> 
> That's what ZFS does... ;-).

ZFS is a 'single' filesystem.

thin-pool is  multi-volume target.

It's approximately as if you would use your XFS/ext4 rootfs placed 
on a ZFS ZVOL device - if you can provide an example where this 'system' 
works more stably & better & faster than thin-pool, it's a clear bug in 
thin-pool - and you should open a bugzilla for this.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-03 18:17                             ` Xen
@ 2018-03-04 20:53                               ` Zdenek Kabelac
  2018-03-05  9:42                                 ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-04 20:53 UTC (permalink / raw)
  To: LVM general discussion and development, Xen

On 3.3.2018 at 19:17, Xen wrote:

>> In the past was argued that putting the entire pool in read-only mode
>> (where *all* writes fail, but read are permitted to complete) would be
>> a better fail-safe mechanism; however, it was stated that no current
>> dmtarget permit that.
> 
> Right. Don't forget my main problem was system hangs due to older kernels, not 
> the stuff you write about now.
> 
>> Two (good) solution where given, both relying on scripting (see
>> "thin_command" option on lvm.conf):
>> - fsfreeze on a nearly full pool (ie: >=98%);
>> - replace the dmthinp target with the error target (using dmsetup).
>>
>> I really think that with the good scripting infrastructure currently
>> built in lvm this is a more-or-less solved problem.
> 
> I agree in practical terms. Doesn't make for good target design, but it's good 
> enough, I guess.

Sometimes you have to settle on the good compromise.

There are various limitation coming from the way how Linux kernel works.

You probably still have the 'vision' that the block device KNOWS where the block 
comes from. I.e. you probably think the thin device is aware that a block is some 
'write' from 'gimp' made by user 'adam'. The clear fact is - the block layer 
only knows that some 'pages' with some sizes need to be written at some location 
on the device - and that's all.

On the other hand, all common filesystems in Linux were always written to work 
on a device where the space is simply always there. So all core algorithms 
simply never counted on something like 'thin-provisioning' - this is almost 
'fine' since thin-provisioning should be almost invisible - but the problem 
starts to be visible in these over-provisioned conditions.

Unfortunately the majority of filesystems never really tested well all those 
'weird' conditions which are suddenly easy to trigger with thin-pool, but 
which likely almost never happen on a real hdd....

So as said - the situation gets better all the time, bugs are fixed as soon as the 
problematic pattern/use case is discovered - that's why it's really important 
that users open bugzillas and report their problems with a detailed 
description of how to hit the problem - this really DOES help a lot.

On the other hand it's really hard to do something for users who are
just saying 'goodbye to LVM'....


>> But is someone *really* pushing thinp for root filesystem? I always
>> used it for data partition only... Sure, rollback capability on root
>> is nice, but it is on data which they are *really* important.
> 
> No, Zdenek thought my system hangs resulted from something else and then in 
> order to defend against that (being the fault of current DM design) he tried 
> to raise the ante by claiming that root-on-thin would cause system failure 
> anyway with a full pool.

Yes - this is still true.
It's how the core logic of the Linux kernel and page caching works.

And that's why it's important to take action *BEFORE*, rather than trying to solve the 
case *AFTER* and hoping the deadlock will not happen...


> I was envisioning some other tag that would allow a quotum to be set for every 
> volume (for example as a %) and the script would then drop the volumes with 
> the larger quotas first (thus the larger snapshots) so as to protect smaller 
> volumes which are probably more important and you can save more of them. I am 
> ashared to admit I had forgotten about that completely ;-).

Every user has quite different logic in mind - so really - we do provide 
the tooling and the user has to choose what fits best...

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-03 17:52                           ` Xen
@ 2018-03-04 23:27                             ` Zdenek Kabelac
  0 siblings, 0 replies; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-04 23:27 UTC (permalink / raw)
  To: LVM general discussion and development, Xen

On 3.3.2018 at 18:52, Xen wrote:
> I did not rewrite this entire message, please excuse the parts where I am a 
>> I'll probably repeat myself again, but thin provisioning can't be
>> responsible for all kernel failures. There is no way the DM team can fix
>> all the related paths on this road.
> 
> Are you saying there are kernel bugs presently?

Hi

It's a sure thing there are kernel bugs present - feel free to dive
into the bugzilla list, either on the RH pages or for the kernel itself...

>> Overprovisioning on the DEVICE level simply IS NOT equivalent to a full
>> filesystem like you would like to see all the time here, and it has
>> already been explained to you many times that filesystems are simply not
>> ready for this - fixes are ongoing but they will take time and it's
>> really pointless to exercise this on 2-3 year old kernels...
> 
> Pardon me, but your position has typically been that it is fundamentally 
> impossible, not that "we're not there yet".

Some things are still fundamentally impossible.

We are just making the 'time-window' where the user can hit the problem much smaller.

Think of it as if you are seeking a car that never crashes...


> My questions have always been about fundamental possibilities, to which you 
> always answer in the negative.

When you post a 'detailed' question - you will get a detailed answer.

If you ask in general - then the general answer is - there are some fundamental 
kernel issues (like the shared page cache) where some cases are unsolvable.

If you change your working constraint set - you can get different 
results/performance...

> If something is fundamentally impossible, don't be surprised if you then don't 
> get any help in getting there: you always close off all paths leading towards it.

The Earth could be blasted by gamma rays from a supernova any second - can we
prevent this?

So seriously, if you have a scenario where it does fail - open a bugzilla and
provide a description/reproducer for your case.

If you seek a 1000% guarantee it will never fail - then we are sorry - this
is not a system with 10 states you can easily keep under control...

> My interest has always been, at least philosophically, or concerning principle 

We are solving real bugs, not philosophy.

> abilities, in development and design, but you shut it off saying it's impossible.

Please, can you stop accusing me of shutting anyone off here.
Provide the exact sentences where I did that....

>> Thin provisioning has it's use case and it expects admin is well aware
>> of possible problems.
> 
> That's a blanket statement once more that says nothing about actual 
> possibilities or impossibilities.

This forum is really not about detailed descriptions of Linux core
functionality. You are always kindly asked to get active and learn how the
Linux kernel works.

Here we are discussing what LVM2 can do.

LVM2 uses whatever the DM targets + kernel provide.

So whenever I say that something is impossible for lvm2 - it's always
related to the current state of the kernel.

If something then changes in the kernel to move things forward - lvm2 can use it.


> You brought up thin snapshotting as a reason for putting root on thin, as a 
> way of saying that thin failure would lead to system failure and not just 
> application failure,
> 
> whereas I maintained that application failure was acceptable.

It's getting pointless to react to this again and again...

> 
> I tried to make the distinction between application level failure (due to 
> filesystem errors) and system instability caused by thin.
> 
> You then tried to make those equivalent by saying that you can also put root 
> on thin, in which case application failure becomes system failure.

So once again for Xen - there *ARE* scenarios where using thin for your
rootfs will block your system if the thin-pool gets full - and this still
applies to the latest kernel.

On the other hand, it's a pretty complicated set of conditions you would need
to meet to hit this...

There should be no such case (system freeze) if you hit a full thin-pool for a
non-rootfs volume.  A somewhat 'fuzzier' question is whether you will be able
to recover a filesystem located on such a thin volume....
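
As an illustration of acting *BEFORE*, a minimal sketch of the usual
mitigation - letting dmeventd monitor the pool and auto-extend it - assuming
a hypothetical VG 'vg' with a thin-pool 'pool', monitoring enabled, and free
space left in the VG:

  # /etc/lvm/lvm.conf (activation section)
  thin_pool_autoextend_threshold = 70   # react once the pool is 70% full
  thin_pool_autoextend_percent   = 20   # grow the pool by 20% each time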

> You want only to take care of the special use case where nothing bad happens.
> 
> Why not just take care of the general use case where bad things can happen?
> 
> You know, real life?

Because the answer '42' will usually not recover a user's data...

The more complex answer is - we solve the more trivial things first...

> In any development process you first don't take care of all error conditions, 
> you just can't be bothered with them yet. Eventually, you do.

We always do care about error paths - likely way more than you can actually
imagine...

That's why we have to admit there are some very hard problems to solve...
and solving them is way harder than educating users to use thin-pools properly.

You are probably missing how big the team behind dm & lvm2 is ;) and how busy
this team already is....

> It seems you are trying to avoid having to deal with the glaring error 
> conditions that have always existed, but you are trying to avoid having to 
> take any responsibility for it by saying that it was not part of the design.

Yep, we can mainly support 'designed' use cases - sad but true....

> To make this more clear Zdenek, your implementation does not cater to the 

If you think lvm2 is using the dm thin-pool kernel target in a bad way - open
a bugzilla describing how it should use this target better - that's my best
advice here.

Keep in mind, I've not implemented the dm thin-pool kernel targets....
(nor the filesystems, the page cache, or the Linux memory model...)


> That's a glaring omission in any design. You can go on and on on how thin-p 
> was not "targetted" at that "use case", but that's like saying you built a car 
> engine that was not "targetted" at "running out of fuel".

Do you expect your engine to do any work when it runs out of fuel?

Adding more fuel/space fixes 99.999% of problems with a thin-pool as well.

> Then when the engine breaks down you say it's the user's fault.

While we are at this comparison:

A Formula One engine can damage itself even when its temperature gets too low....

Currently, most users we support prefer more speed and take care of the
thin-pool to prevent it from running into unsupported corner cases...

> 
> Maybe retarget your design?

When you find a lot of users interested in having (and paying development
for) a low-performing thin-pool where every sector update does a full
validation of the metadata......

We are possibly waiting for you to show how to do it better.

I promise I'll implement lvm2 support for your DM target once users find it
worthwhile....

> It's a failure condition that you have to design for.

You probably still missed the message - the thin-pool *IS* designed not to
crash itself!

If the kernel crashes on a kernel bug because of the thin-pool - it'd be a
serious bug to fix and you would need to open a BZ for such a case.

However, the problem you are usually seeing is some 'related' problem - like
the unrecoverability of a filesystem sitting on top of a thin volume....


> 
>> Full thin-pool is serious ERROR condition with bad/ill effects on systems.
> 
> Yes and your job as a systems designer is to design for those error conditions 
> and make sure they are handled gracefully.

Just repeating here - the thin-pool is designed for the out-of-space
(out-of-fuel) case. The rest of the kernel - i.e. filesystems and user-space -
has quite some room for improvement, since it's not expecting that it's using
non-existing space....

> This is a false analogy, we only care about application failure in the general 
> use case of stuff that is allowed to happen, and we distinguish system failure 
> which is not allowed to happen.

Your system just runs a set of user-space applications...

At the 'block layer' we have no idea which blocks belong to what.

> Yes Zdenek, system failure is your responsibility as the designer, it's not 

To give you another thinking point:

Think of a thin volume running from an out-of-space thin-pool as a device with
unpredictable error behavior (returning WRITE errors from time to time).

Do you expect any HDD/SSD developer/manufacturer to be responsible for making
your system unstable when the device hits an error condition?
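
If you want to see how a filesystem copes with such a device, one way to
reproduce it is the dm-flakey target - a rough sketch, with the device path
and intervals as placeholders:

  # present /dev/sdX1 as a device that works for 30s, then errors I/O for 5s
  SECTORS=$(blockdev --getsz /dev/sdX1)
  dmsetup create flaky --table "0 $SECTORS flakey /dev/sdX1 0 30 5"
  mkfs.xfs /dev/mapper/flaky   # then exercise the fs and watch dmesg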


> But that, clearly, would be seen as a failure on behalf of the one designing 
> the engine >
> You are responding so defensively when I have barely said anything that it is 
> clear you feel extremely guilty about this.

Not at all.

And I hope to see your patches - they will show us all what poor developers
we are...

> You only designed for a very special use case in which the fuel never runs out.

So when was the last time you ran out of fuel in your car?
I've been driving for many, many years and have never done that. And I don't
even know ANYONE in person who has run out of fuel.

I'm sure there ARE a few people who have managed to run out of fuel - they
will have a pretty bad day - and for sure they have to 'restart' their car! -
but because of this possibility I'm not advocating building a gas station
every mile...

> You built a system that only works if certain conditions are met.

Yes - we have priorities we want to address.

You happen to have different ones - it's very simple to design things differently.

Just please stop pointlessly trying to convince us that only your goals are
GOOD and ours are BAD and that we are kindergarten kids...

It's an open-source world - just make your design fly...


> So yes: I hear you, you didn't design for the error condition.
> 
> That's all I've been saying.

And you are completely misunderstanding it.

The only way you are likely to even slightly understand it is if you simply
start writing something yourself.


> I mean I doubt that it would require a huge rewrite, I think that if I were to 
> do that thing, I could start off with thin-p just fine.

Unfortunately you can't....

> Certainly, interesting, and a worthy goal.

Please just go for it.


> I was just explaining why I was experiencing hangs and you didn't know what I 
> was talking about, causing some slight confusion in our threads.

I've experienced my window manager crashing, my kernel crashing many times,
and many apps crashing.

When I get annoyed - I sit down and try to file a proper bugzilla report - and
surprise - in a lot of cases I get a fix from the maintainer, or in a trivial
case I can post a patch myself...

So that's my best advice for you too.

> Please, if you keep responding to development inquiries with status quo 
> answers, you will never find any help in getting there.

Well yeah - if someone asks me how he can solve an existing problem today,
I'm not going to answer his question with a long story about how he could
solve it in the next decade...

It either works today or it doesn't...

There is no question of 'what if I were to fix A, B, C, D, ....'


> Those system hangs, sure, status quo. Those snapshots? Development interest.

I'm confused then about which HANG you are still talking about?

The thin-pool itself does NOT hang.

> Like I said, I was just INQUIRING about the possibility of limiting the size 
> of a thin snapshot.

And you've got the answer already many times - ATM the thin-pool data
structures are not designed to meet this request.

It would really be a complete redesign.

For existing thin-pool users it is good enough to know the total free space in
the thin-pool, and to manage operations based on that.
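
For reference, reading that free-space figure is a one-liner (the VG/pool
names are placeholders):

  # human-readable data/metadata usage of the pool
  lvs -o lv_name,data_percent,metadata_percent vg/pool

  # script-friendly variant
  lvs --noheadings -o data_percent vg/pool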


> The fact that you respond so defensively with respect to thin pools 
> overflowing, means you feel and are guilty about not taking care of that 
> situation.

It's not about not taking care - it's intentional.
Performance and memory constraints are behind this.

If you don't care about performance and memory (i.e. you have a different
constraint set) - your ideas could be supported better.

It's also worth saying that your particular case, with one thin origin and
just a number of snapshots, is a rather limited, minor use case.
The thin-pool is more about parallel, independent volume usage....

> I asked a technical question. You respond like a guy who is asked why he 
> didn't clean the bathroom.

You have already got your answers many times in this list....

> Easy now, I just asked whether it was possible or not.

No, it's *NOT* possible with the dm thin-pool target.
It might be possible with a XEN provisioning target...

> I would say you feel rather guilty and to every insinuation that there is a 
> missing feature you respond with great noise as to why the feature isn't 
> actually missing.
> 

I did not design the DM thin-pool target myself, so whatever kind of personal
blame you keep putting on me here is actually completely irrelevant...
(and this is not the 1st time this has been explained to you).

> So if I say "Is this possible?" you respond with "YOU ARE USING IT THE WRONG 
> WAY" as if to feel rather uneasy to say that something isn't possible.

All my answers always relate to the current Linux kernel
and the dm thin-pool target.


> Which again, leads, of course, to bad design.
> 
> Your uneasiness Zdenek is the biggest signpost here.

I can say it's not me who is 'uneasy' here...

> 
> 2) My only inquiry had been about preventing snapshot overflow.
> 

And it was explained to you that the supported and suggested solution is to
monitor the thin-pool and solve the problem in user-space by removing unneeded
thin volumes ahead of time...
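
A minimal sketch of such a user-space watchdog - the VG 'vg', pool 'pool' and
'snap-*' naming are hypothetical, and a real script would want more error
handling:

  #!/bin/sh
  # drop the oldest snapshot once the pool crosses 90% data usage
  THRESHOLD=90
  USED=$(lvs --noheadings -o data_percent vg/pool | tr -d ' ' | cut -d. -f1)
  if [ "$USED" -ge "$THRESHOLD" ]; then
      OLDEST=$(lvs --noheadings -O lv_time -o lv_name \
               --select 'lv_name=~"^snap-"' vg | head -n1 | tr -d ' ')
      [ -n "$OLDEST" ] && lvremove -y "vg/$OLDEST"
  fi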

Is all your 'lengthy' messaging on this list just because you don't like to
'play' the game the lvm2 way??

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-04 20:53                               ` Zdenek Kabelac
@ 2018-03-05  9:42                                 ` Gionatan Danti
  2018-03-05 10:18                                   ` Zdenek Kabelac
  0 siblings, 1 reply; 94+ messages in thread
From: Gionatan Danti @ 2018-03-05  9:42 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Xen

On 04-03-2018 21:53, Zdenek Kabelac wrote:
> On the other hand, all common filesystems in Linux were always written
> to work on a device where the space is simply always there. So all
> core algorithms simply never counted on something like
> 'thin-provisioning' - this is almost 'fine', since thin-provisioning
> should be almost invisible - but the problem starts to be visible
> under these over-provisioned conditions.
> 
> Unfortunately, the majority of filesystems were never really tested
> against all those 'weird' conditions which are suddenly easy to trigger
> with a thin-pool, but which almost never happen on a real HDD....

Hi Zdenek, I'm a little confused by that statement.
Sure, it is 100% true for EXT3/4-based filesystems; however, asking on the
XFS mailing list about that, I got the definitive answer that XFS was
adapted to cope well with thin provisioning ages ago. Is that the case?

Anyway, a more direct question: what prevented the device mapper team from
implementing a full-read-only/fail-all-writes target? I feel that *many*
filesystem problems could be bypassed with fully read-only pools... Am I
wrong?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-05  9:42                                 ` Gionatan Danti
@ 2018-03-05 10:18                                   ` Zdenek Kabelac
  2018-03-05 14:27                                     ` Gionatan Danti
  0 siblings, 1 reply; 94+ messages in thread
From: Zdenek Kabelac @ 2018-03-05 10:18 UTC (permalink / raw)
  To: LVM general discussion and development, Gionatan Danti; +Cc: Xen

On 5.3.2018 at 10:42, Gionatan Danti wrote:
> On 04-03-2018 21:53, Zdenek Kabelac wrote:
>> On the other hand, all common filesystems in Linux were always written
>> to work on a device where the space is simply always there. So all
>> core algorithms simply never counted on something like
>> 'thin-provisioning' - this is almost 'fine', since thin-provisioning
>> should be almost invisible - but the problem starts to be visible
>> under these over-provisioned conditions.
>>
>> Unfortunately, the majority of filesystems were never really tested
>> against all those 'weird' conditions which are suddenly easy to trigger
>> with a thin-pool, but which almost never happen on a real HDD....
> 
> Hi Zdenek, I'm a little confused by that statement.
> Sure, it is 100% true for EXT3/4-based filesystems; however, asking on the
> XFS mailing list about that, I got the definitive answer that XFS was
> adapted to cope well with thin provisioning ages ago. Is that the case?

Yes - it has been updated/improved/fixed - and I've already given you a link
showing where you can configure the behavior of XFS when e.g. the device
reports ENOSPC to the filesystem.

What needs to be understood here is that filesystems were not originally
designed to ever see this kind of error - when you created a filesystem in the
past, the space was meant to be there all the time.
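
From memory, those knobs live under sysfs on reasonably recent kernels -
treat the exact paths and defaults as something to verify against your
kernel's XFS documentation ('dm-3' is a placeholder device name):

  # fail metadata writes hitting ENOSPC immediately instead of retrying forever
  echo 0 > /sys/fs/xfs/dm-3/error/metadata/ENOSPC/max_retries
  # or give up after a bounded time (in seconds)
  echo 30 > /sys/fs/xfs/dm-3/error/metadata/ENOSPC/retry_timeout_seconds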

> Anyway, a more direct question: what prevented the device mapper team from
> implementing a full-read-only/fail-all-writes target? I feel that *many*
> filesystem problems could be bypassed with fully read-only pools... Am I wrong?

Well, complexity - it might look 'easy' to do at first sight, but in reality
it would impact all hot/fast paths with a number of checks, and it would
have a rather dramatic performance impact.

The other point is that, while for lots of filesystems it might look like the
best thing - it's not always true - there are cases where it's more desirable
to still have a working device with 'several' failing pieces in it...

And the 3rd point is - it's unclear, from the kernel's POV, when this 'full
pool' moment actually happens - i.e. imagine a 'write' operation running on
one thin device and a 'trim/discard' operation running on a 2nd device.

So it's been left to user-space to solve the case in the best way -
i.e. user-space can initiate 'fstrim' itself when the full-pool case happens,
or get the space back in a number of other ways...
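
For example (the mount point is a placeholder):

  # return no-longer-used filesystem blocks to the thin-pool
  fstrim -v /srv/vmstore

The same can be kicked off from a monitoring script, or periodically via the
fstrim.timer unit shipped with util-linux/systemd.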

Regards

Zdenek

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
  2018-03-05 10:18                                   ` Zdenek Kabelac
@ 2018-03-05 14:27                                     ` Gionatan Danti
  0 siblings, 0 replies; 94+ messages in thread
From: Gionatan Danti @ 2018-03-05 14:27 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development; +Cc: Xen

On 05/03/2018 11:18, Zdenek Kabelac wrote:
> Yes - it has been updated/improved/fixed - and I've already given you a
> link showing where you can configure the behavior of XFS when e.g. the
> device reports ENOSPC to the filesystem.

Sure - I already studied it months ago during my testing. I was simply under
the impression that the dm & xfs teams had different points of view regarding
the actual status. I'm happy to know that this isn't the case :)

> Well, complexity - it might look 'easy' to do at first sight, but in
> reality it would impact all hot/fast paths with a number of checks, and it
> would have a rather dramatic performance impact.
> 
> The other point is that, while for lots of filesystems it might look like
> the best thing - it's not always true - there are cases where it's more
> desirable to still have a working device with 'several' failing pieces in it...
> 
> And the 3rd point is - it's unclear, from the kernel's POV, when this 'full
> pool' moment actually happens - i.e. imagine a 'write' operation running on
> one thin device and a 'trim/discard' operation running on a 2nd device.
> 
> So it's been left to user-space to solve the case in the best way -
> i.e. user-space can initiate 'fstrim' itself when the full-pool case
> happens, or get the space back in a number of other ways...

Ok, I see.
Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 94+ messages in thread

end of thread, other threads:[~2018-03-05 14:27 UTC | newest]

Thread overview: 94+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-06 14:31 [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM Gionatan Danti
2017-04-07  8:19 ` Mark Mielke
2017-04-07  9:12   ` Gionatan Danti
2017-04-07 13:50     ` L A Walsh
2017-04-07 16:33       ` Gionatan Danti
2017-04-13 12:59         ` Stuart Gathman
2017-04-13 13:52           ` Xen
2017-04-13 14:33             ` Zdenek Kabelac
2017-04-13 14:47               ` Xen
2017-04-13 15:29               ` Stuart Gathman
2017-04-13 15:43                 ` Xen
2017-04-13 17:26                   ` Stuart D. Gathman
2017-04-13 17:32                   ` Stuart D. Gathman
2017-04-14 15:17                     ` Xen
2017-04-14  7:27               ` Gionatan Danti
2017-04-14  7:23           ` Gionatan Danti
2017-04-14 15:23             ` Xen
2017-04-14 15:53               ` Gionatan Danti
2017-04-14 16:08                 ` Stuart Gathman
2017-04-14 17:36                 ` Xen
2017-04-14 18:59                   ` Gionatan Danti
2017-04-14 19:20                     ` Xen
2017-04-15  8:27                       ` Xen
2017-04-15 23:35                         ` Xen
2017-04-17 12:33                         ` Xen
2017-04-15 21:22                     ` Xen
2017-04-15 21:49                       ` Xen
2017-04-15 21:48                     ` Xen
2017-04-18 10:17                       ` Zdenek Kabelac
2017-04-18 13:23                         ` Gionatan Danti
2017-04-18 14:32                           ` Stuart D. Gathman
2017-04-19  7:22                         ` Xen
2017-04-07 22:24     ` Mark Mielke
2017-04-08 11:56       ` Gionatan Danti
2017-04-07 18:21 ` Tomas Dalebjörk
2017-04-13 10:20 ` Gionatan Danti
2017-04-13 12:41   ` Xen
2017-04-14  7:20     ` Gionatan Danti
2017-04-14  8:24       ` Zdenek Kabelac
2017-04-14  9:07         ` Gionatan Danti
2017-04-14  9:37           ` Zdenek Kabelac
2017-04-14  9:55             ` Gionatan Danti
2017-04-22  7:14         ` Gionatan Danti
2017-04-22 16:32           ` Xen
2017-04-22 20:58             ` Gionatan Danti
2017-04-22 21:17             ` Zdenek Kabelac
2017-04-23  5:29               ` Xen
2017-04-23  9:26                 ` Zdenek Kabelac
2017-04-24 21:02                   ` Xen
2017-04-24 21:59                     ` Zdenek Kabelac
2017-04-26  7:26                       ` Gionatan Danti
2017-04-26  7:42                         ` Zdenek Kabelac
2017-04-26  8:10                           ` Gionatan Danti
2017-04-26 11:23                             ` Zdenek Kabelac
2017-04-26 13:37                               ` Gionatan Danti
2017-04-26 14:33                                 ` Zdenek Kabelac
2017-04-26 16:37                                   ` Gionatan Danti
2017-04-26 18:32                                     ` Stuart Gathman
2017-04-26 19:24                                     ` Stuart Gathman
2017-05-02 11:00                                     ` Gionatan Danti
2017-05-12 13:02                                       ` Gionatan Danti
2017-05-12 13:42                                         ` Joe Thornber
2017-05-14 20:39                                           ` Gionatan Danti
2017-05-15 12:50                                             ` Zdenek Kabelac
2017-05-15 14:48                                               ` Gionatan Danti
2017-05-15 15:33                                                 ` Zdenek Kabelac
2017-05-16  7:53                                                   ` Gionatan Danti
2017-05-16 10:54                                                     ` Zdenek Kabelac
2017-05-16 13:38                                                       ` Gionatan Danti
2018-02-27 18:39                       ` Xen
2018-02-28  9:26                         ` Zdenek Kabelac
2018-02-28 19:07                           ` Gionatan Danti
2018-02-28 21:43                             ` Zdenek Kabelac
2018-03-01  7:14                               ` Gionatan Danti
2018-03-01  8:31                                 ` Zdenek Kabelac
2018-03-01  9:43                                   ` Gianluca Cecchi
2018-03-01 11:10                                     ` Zdenek Kabelac
2018-03-01  9:52                                   ` Gionatan Danti
2018-03-01 11:23                                     ` Zdenek Kabelac
2018-03-01 12:48                                       ` Gionatan Danti
2018-03-01 16:00                                         ` Zdenek Kabelac
2018-03-01 16:26                                           ` Gionatan Danti
2018-03-03 18:32                               ` Xen
2018-03-04 20:34                                 ` Zdenek Kabelac
2018-03-03 18:17                             ` Xen
2018-03-04 20:53                               ` Zdenek Kabelac
2018-03-05  9:42                                 ` Gionatan Danti
2018-03-05 10:18                                   ` Zdenek Kabelac
2018-03-05 14:27                                     ` Gionatan Danti
2018-03-03 17:52                           ` Xen
2018-03-04 23:27                             ` Zdenek Kabelac
2017-04-22 21:22           ` Zdenek Kabelac
2017-04-24 13:49             ` Gionatan Danti
2017-04-24 14:48               ` Zdenek Kabelac

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).