linux-lvm.redhat.com archive mirror
* [linux-lvm] LVM performance vs direct dm-thin
@ 2022-01-29 20:34 Demi Marie Obenour
  2022-01-29 21:32 ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-01-29 20:34 UTC (permalink / raw)
  To: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 611 bytes --]

How much slower are operations on an LVM2 thin pool compared to manually
managing a dm-thin target via ioctls?  I am mostly concerned about
volume snapshot, creation, and destruction.  Data integrity is very
important, so taking shortcuts that risk data loss is out of the
question.  However, the application may have some additional information
that LVM2 does not have.  For instance, it may know that the volume that
it is snapshotting is not in use, or that a certain volume it is
creating will never be used after power-off.
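
For concreteness, these are roughly the operations I mean on the lvm2 side
(a minimal sketch driving the CLI from Python; the VG, pool and volume
names are placeholders, not our real layout):

  import subprocess

  def lvm(*args):
      # Run one lvm2 command and fail loudly on any error.
      subprocess.run(args, check=True)

  # Create a fresh thin volume in an existing pool.
  lvm("lvcreate", "-V", "10G", "--thinpool", "vg/pool", "-n", "vm-root")

  # Snapshot an existing thin volume (thin snapshots need no size).
  lvm("lvcreate", "-s", "vg/vm-root", "-n", "vm-root-snap")

  # Destroy a volume that is no longer needed.
  lvm("lvremove", "-y", "vg/vm-root-snap")

The question is essentially how much overhead each of those calls carries
compared to issuing the equivalent dm-thin operations directly.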
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-29 20:34 [linux-lvm] LVM performance vs direct dm-thin Demi Marie Obenour
@ 2022-01-29 21:32 ` Zdenek Kabelac
  2022-01-30  0:32   ` Demi Marie Obenour
  0 siblings, 1 reply; 22+ messages in thread
From: Zdenek Kabelac @ 2022-01-29 21:32 UTC (permalink / raw)
  To: LVM general discussion and development, Demi Marie Obenour

Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
> How much slower are operations on an LVM2 thin pool compared to manually
> managing a dm-thin target via ioctls?  I am mostly concerned about
> volume snapshot, creation, and destruction.  Data integrity is very
> important, so taking shortcuts that risk data loss is out of the
> question.  However, the application may have some additional information
> that LVM2 does not have.  For instance, it may know that the volume that
> it is snapshotting is not in use, or that a certain volume it is
> creating will never be used after power-off.
> 

Hi

Short answer: it depends ;)

Longer story:
If you want to create a few thins per hour - then it doesn't really matter.
If you want to create a few thins per second - then the cost of lvm2 management
is very high - as lvm2 does far more work than just sending a simple ioctl
(it's called logical volume management for a reason).

So brave developers may always write their own management tools for their
constrained environment requirements, which will be significantly faster in
terms of how many thins you can create per minute (btw, you will also need to
consider dropping the use of udev on such a system).

It's worth mentioning that the more bullet-proof you want to make your
project, the closer you will get to the extra processing done by lvm2.

However, before you step into these waters, you should probably evaluate
whether thin-pool actually meets your needs given such high expectations for
the number of supported volumes - so you do not end up with hyper-fast
snapshot creation while the actual usage fails to meet your needs...

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-29 21:32 ` Zdenek Kabelac
@ 2022-01-30  0:32   ` Demi Marie Obenour
  2022-01-30 10:52     ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-01-30  0:32 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 2633 bytes --]

On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
> Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
> > How much slower are operations on an LVM2 thin pool compared to manually
> > managing a dm-thin target via ioctls?  I am mostly concerned about
> > volume snapshot, creation, and destruction.  Data integrity is very
> > important, so taking shortcuts that risk data loss is out of the
> > question.  However, the application may have some additional information
> > that LVM2 does not have.  For instance, it may know that the volume that
> > it is snapshotting is not in use, or that a certain volume it is
> > creating will never be used after power-off.
> > 
> 
> Hi
> 
> Short answer: it depends ;)
> 
> Longer story:
> If you want to create few thins per hour - than it doesn't really matter.
> If you want to create few thins in a second - than the cost of lvm2
> management is very high  - as lvm2 does far more work then just sending a
> simple ioctl (as it's called logical volume management for a reason)

Qubes OS definitely falls into the second category.  Starting a qube
(virtual machine) generally involves creating three thins (one fresh and
two snapshots).  Furthermore, Qubes OS frequently starts qubes in
response to user actions, so thin volume creation speed directly impacts
system responsiveness.
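
To make the cost concrete, the latency I care about is roughly what this
sketch measures (all names and sizes below are arbitrary placeholders, not
our actual volume layout):

  import subprocess, time

  def timed(cmd):
      # Wall-clock latency of a single storage management operation.
      start = time.monotonic()
      subprocess.run(cmd, check=True)
      return time.monotonic() - start

  # One qube start: one fresh volatile volume plus two snapshots.
  latency = sum([
      timed(["lvcreate", "-V", "10G", "--thinpool", "vg/pool",
             "-n", "disp1-volatile"]),
      timed(["lvcreate", "-s", "vg/template-root", "-n", "disp1-root"]),
      timed(["lvcreate", "-s", "vg/appvm-private", "-n", "disp1-private"]),
  ])
  print(f"storage setup latency: {latency * 1000:.0f} ms")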

> So brave developers may always write their own management tools for their
> constrained environment requirements that will by significantly faster in
> terms of how many thins you could create per minute (btw you will need to
> also consider dropping usage of udev on such system)

What kind of constraints are you referring to?  Is it possible and safe
to have udev running, but told to ignore the thins in question?

> It's worth to mention - the more bullet-proof you will want to make your
> project - the more closer to the extra processing made by lvm2 you will get.

Why is this?  How does lvm2 compare to stratis, for example?

> However before you will step into these waters - you should probably
> evaluate whether thin-pool actually meet your needs if you have that high
> expectation for number of supported volumes - so you will not end up with
> hyper fast snapshot creation while the actual usage then is not meeting your
> needs...

What needs are you thinking of specifically?  Qubes OS needs block
devices, so filesystem-backed storage would require the use of loop
devices unless I use ZFS zvols.  Do you have any specific
recommendations?

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30  0:32   ` Demi Marie Obenour
@ 2022-01-30 10:52     ` Zdenek Kabelac
  2022-01-30 16:45       ` Demi Marie Obenour
  0 siblings, 1 reply; 22+ messages in thread
From: Zdenek Kabelac @ 2022-01-30 10:52 UTC (permalink / raw)
  To: Demi Marie Obenour; +Cc: LVM general discussion and development

Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
> On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
>> Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
>>> How much slower are operations on an LVM2 thin pool compared to manually
>>> managing a dm-thin target via ioctls?  I am mostly concerned about
>>> volume snapshot, creation, and destruction.  Data integrity is very
>>> important, so taking shortcuts that risk data loss is out of the
>>> question.  However, the application may have some additional information
>>> that LVM2 does not have.  For instance, it may know that the volume that
>>> it is snapshotting is not in use, or that a certain volume it is
>>> creating will never be used after power-off.
>>>
> 
>> So brave developers may always write their own management tools for their
>> constrained environment requirements that will by significantly faster in
>> terms of how many thins you could create per minute (btw you will need to
>> also consider dropping usage of udev on such system)
> 
> What kind of constraints are you referring to?  Is it possible and safe
> to have udev running, but told to ignore the thins in question?

Lvm2 is oriented more towards managing a set of different disks,
where the user is adding/removing/replacing them.  So it's more about
recoverability, good support for manual repair (ASCII metadata),
tracking history of changes, backward compatibility, support
for conversion to different volume types (i.e. caching of thins, pvmove...),
support for no/udev & no/systemd, clusters, and nearly every Linux distro
available... So there is a lot - and all of this adds considerable complexity.

So once you scratch all this - and say you only care about a single disk -
then you are able to use more efficient metadata formats which you could even
keep permanently in memory for the whole lifetime - this all gives great performance.

But it all depends on how much you can constrain your environment.

It's worth mentioning that there is lvm2 support for 'external' thin-volume
creators - lvm2 only maintains the thin-pool data & metadata LVs, while
creation, activation and deactivation of thin volumes is left to an external
tool. This was used by Docker for a while - later on they switched to
overlayfs, I believe.
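
Roughly speaking, such an external tool talks to the thin-pool target
directly - something like this sketch (device names, device IDs and sizes
are purely illustrative; the message interface is described in the kernel's
thin-provisioning documentation):

  import subprocess

  POOL = "vg-vmpool-tpool"   # illustrative name of the active pool device
  SECTORS = 20971520         # 10 GiB in 512-byte sectors

  def dmsetup(*args):
      subprocess.run(("dmsetup",) + args, check=True)

  # Allocate a new thin device ID in the pool metadata.
  dmsetup("message", POOL, "0", "create_thin 42")

  # Activate it by loading a 'thin' table pointing at the pool + ID.
  dmsetup("create", "vm-root",
          "--table", f"0 {SECTORS} thin /dev/mapper/{POOL} 42")

  # Snapshot (the origin thin must be suspended or inactive first).
  dmsetup("message", POOL, "0", "create_snap 43 42")

  # Delete a device that is no longer needed (and no longer active).
  dmsetup("message", POOL, "0", "delete 43")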

> 
>> It's worth to mention - the more bullet-proof you will want to make your
>> project - the more closer to the extra processing made by lvm2 you will get.
> 
> Why is this?  How does lvm2 compare to stratis, for example?

Stratis is yet another volume manager, written in Rust and combined with XFS
for an easier user experience. That's all I'd probably say about it...

>> However before you will step into these waters - you should probably
>> evaluate whether thin-pool actually meet your needs if you have that high
>> expectation for number of supported volumes - so you will not end up with
>> hyper fast snapshot creation while the actual usage then is not meeting your
>> needs...
> 
> What needs are you thinking of specifically?  Qubes OS needs block
> devices, so filesystem-backed storage would require the use of loop
> devices unless I use ZFS zvols.  Do you have any specific
> recommendations?

As long as you live in a world without crashes, buggy kernels, buggy apps and
failing hard drives, everything looks very simple.
And any development costs quite some time & money.

Since you mentioned ZFS - you might want to focus on a 'ZFS-only' solution.
Combining ZFS or Btrfs with lvm2 is always going to be painful, as those
filesystems have their own volume management.

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 10:52     ` Zdenek Kabelac
@ 2022-01-30 16:45       ` Demi Marie Obenour
  2022-01-30 17:43         ` Zdenek Kabelac
  2022-01-30 21:39         ` Stuart D. Gathman
  0 siblings, 2 replies; 22+ messages in thread
From: Demi Marie Obenour @ 2022-01-30 16:45 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 6200 bytes --]

On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
> Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
> > On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
> > > Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
> > > > How much slower are operations on an LVM2 thin pool compared to manually
> > > > managing a dm-thin target via ioctls?  I am mostly concerned about
> > > > volume snapshot, creation, and destruction.  Data integrity is very
> > > > important, so taking shortcuts that risk data loss is out of the
> > > > question.  However, the application may have some additional information
> > > > that LVM2 does not have.  For instance, it may know that the volume that
> > > > it is snapshotting is not in use, or that a certain volume it is
> > > > creating will never be used after power-off.
> > > > 
> > 
> > > So brave developers may always write their own management tools for their
> > > constrained environment requirements that will by significantly faster in
> > > terms of how many thins you could create per minute (btw you will need to
> > > also consider dropping usage of udev on such system)
> > 
> > What kind of constraints are you referring to?  Is it possible and safe
> > to have udev running, but told to ignore the thins in question?
> 
> Lvm2 is oriented more towards managing set of different disks,
> where user is adding/removing/replacing them.  So it's more about
> recoverability, good support for manual repair  (ascii metadata),
> tracking history of changes,  backward compatibility, support
> of conversion to different volume types (i.e. caching of thins, pvmove...)
> Support for no/udev & no/systemd, clusters and nearly every linux distro
> available... So there is a lot - and this all adds quite complexity.

I am certain it does, and that makes a lot of sense.  Thanks for the
hard work!  Those features are all useful for Qubes OS, too — just not
in the VM startup/shutdown path.

> So once you scratch all this - and you say you only care about single disc
> then you are able to use more efficient metadata formats which you could
> even keep permanently in memory during the lifetime - this all adds great
> performance.
> 
> But it all depends how you could constrain your environment.
> 
> It's worth to mention there is lvm2 support for 'external' 'thin volume'
> creators - so lvm2 only maintains 'thin-pool' data & metadata LV - but thin
> volume creation, activation, deactivation of thins is left to external tool.
> This has been used by docker for a while - later on they switched to
> overlayFs I believe..

That indeed sounds like a good choice for Qubes OS.  It would allow the
data and metadata LVs to be any volume type that lvm2 supports, and
managed using all of lvm2’s features.  So one could still put the
metadata on a RAID-10 volume while everything else is RAID-6, or set up
a dm-cache volume to store the data (please correct me if I am wrong).
Qubes OS has already moved to using a separate thin pool for virtual
machines, as it prevents dom0 (privileged management VM) from being run
out of disk space (by accident or malice).  That means that the thin
pool used for guests is managed only by Qubes OS, so the standard
lvm2 tools do not need to touch it.
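
Something like the following is what I have in mind (only a sketch with
made-up names and sizes, assuming enough PVs for each RAID level - please
correct me if this is not the recommended way):

  import subprocess

  def lvm(*args):
      subprocess.run(args, check=True)

  # Data LV on RAID-6 and metadata LV on RAID-10 (assumes the VG has
  # enough PVs for each RAID level; sizes are made up).
  lvm("lvcreate", "--type", "raid6", "-L", "1T", "-n", "vmpool", "vg")
  lvm("lvcreate", "--type", "raid10", "-L", "16G", "-n", "vmpool_meta", "vg")

  # Combine them into a thin pool; lvm2 keeps managing the RAID beneath.
  lvm("lvconvert", "--type", "thin-pool",
      "--poolmetadata", "vg/vmpool_meta", "vg/vmpool")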

Is this a setup that you would recommend, and would be comfortable using
in production?  As far as metadata is concerned, Qubes OS has its own
XML file containing metadata about all qubes, which should suffice for
this purpose.  To prevent races during updates and ensure automatic
crash recovery, is it sufficient to store metadata for both new and old
transaction IDs, and pick the correct one based on the device-mapper
status line?  I have seen lvm2 get into an inconsistent state (transaction
ID off by one) that required manual repair before, which is quite
unnerving for a desktop OS.
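
Concretely, I am imagining recovery logic along these lines (a sketch; the
pool device name is a placeholder, and the field positions reflect my
reading of the thin-pool status format):

  import subprocess

  def pool_transaction_id(pool_dm_name):
      # 'dmsetup status' for a thin-pool target begins with:
      #   <start> <length> thin-pool <transaction id> <meta> <data> ...
      out = subprocess.run(["dmsetup", "status", pool_dm_name],
                           check=True, capture_output=True,
                           text=True).stdout
      fields = out.split()
      assert fields[2] == "thin-pool", "not a thin-pool device"
      return int(fields[3])

  # Keep application metadata for both the old and the new transaction
  # ID, then pick whichever record matches what the pool actually
  # committed before the crash.
  committed = pool_transaction_id("vg-vmpool-tpool")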

One feature that would be nice is to be able to import an
externally-provided mapping of thin pool device numbers to LV names, so
that lvm2 could provide a (read-only, and not guaranteed fresh) view of
system state for reporting purposes.

> > > It's worth to mention - the more bullet-proof you will want to make your
> > > project - the more closer to the extra processing made by lvm2 you will get.
> > 
> > Why is this?  How does lvm2 compare to stratis, for example?
> 
> Stratis is yet another volume manager written in Rust combined with XFS for
> easier user experience. That's all I'd probably say about it...

That’s fine.  I guess my question is why making lvm2 bullet-proof needs
so much overhead.

> > > However before you will step into these waters - you should probably
> > > evaluate whether thin-pool actually meet your needs if you have that high
> > > expectation for number of supported volumes - so you will not end up with
> > > hyper fast snapshot creation while the actual usage then is not meeting your
> > > needs...
> > 
> > What needs are you thinking of specifically?  Qubes OS needs block
> > devices, so filesystem-backed storage would require the use of loop
> > devices unless I use ZFS zvols.  Do you have any specific
> > recommendations?
> 
> As long as you live in the world without crashes, buggy kernels, apps  and
> failing hard drives everything looks very simple.

Would you mind explaining further?  LVM2 RAID and cache volumes should
provide most of the benefits that Qubes OS desires, unless I am missing
something.

> And every development costs quite some time & money.

That it does.

> Since you mentioned ZFS - you might want focus on using 'ZFS-only' solution.
> Combining  ZFS or Btrfs with lvm2 is always going to be a painful way as
> those filesystems have their own volume management.

Absolutely!  That said, I do wonder what your thoughts on using loop
devices for VM storage are.  I know they are slower than thin volumes,
but they are also much easier to manage, since they are just ordinary
disk files.  Any filesystem with reflink can provide the needed
copy-on-write support.
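
The workflow I have in mind is roughly this (a sketch with made-up paths,
assuming a reflink-capable filesystem such as XFS or btrfs):

  import subprocess

  BASE = "/var/lib/qubes/templates/fedora-root.img"
  CLONE = "/var/lib/qubes/disp/disp1-root.img"

  # Copy-on-write clone of the template image via reflink.
  subprocess.run(["cp", "--reflink=always", BASE, CLONE], check=True)

  # Attach it to a loop device so a block-only backend (blkback) can
  # use it; --find picks a free /dev/loopN and --show prints its name.
  loopdev = subprocess.run(["losetup", "--find", "--show", CLONE],
                           check=True, capture_output=True,
                           text=True).stdout.strip()
  print("backing device:", loopdev)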

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 16:45       ` Demi Marie Obenour
@ 2022-01-30 17:43         ` Zdenek Kabelac
  2022-01-30 20:27           ` Gionatan Danti
  2022-02-02  2:09           ` Demi Marie Obenour
  2022-01-30 21:39         ` Stuart D. Gathman
  1 sibling, 2 replies; 22+ messages in thread
From: Zdenek Kabelac @ 2022-01-30 17:43 UTC (permalink / raw)
  To: Demi Marie Obenour; +Cc: LVM general discussion and development

Dne 30. 01. 22 v 17:45 Demi Marie Obenour napsal(a):
> On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
>> Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
>>> On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
>>>> Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
>>>>> How much slower are operations on an LVM2 thin pool compared to manually
>>>>> managing a dm-thin target via ioctls?  I am mostly concerned about
>>>>> volume snapshot, creation, and destruction.  Data integrity is very
>>>>> important, so taking shortcuts that risk data loss is out of the
>>>>> question.  However, the application may have some additional information
>>>>> that LVM2 does not have.  For instance, it may know that the volume that
>>>>> it is snapshotting is not in use, or that a certain volume it is
>>>>> creating will never be used after power-off.
>>>>>
>>>
>>>> So brave developers may always write their own management tools for their
>>>> constrained environment requirements that will by significantly faster in
>>>> terms of how many thins you could create per minute (btw you will need to
>>>> also consider dropping usage of udev on such system)
>>>
>>> What kind of constraints are you referring to?  Is it possible and safe
>>> to have udev running, but told to ignore the thins in question?
>>
>> Lvm2 is oriented more towards managing set of different disks,
>> where user is adding/removing/replacing them.  So it's more about
>> recoverability, good support for manual repair  (ascii metadata),
>> tracking history of changes,  backward compatibility, support
>> of conversion to different volume types (i.e. caching of thins, pvmove...)
>> Support for no/udev & no/systemd, clusters and nearly every linux distro
>> available... So there is a lot - and this all adds quite complexity.
> 
> I am certain it does, and that makes a lot of sense.  Thanks for the
> hard work!  Those features are all useful for Qubes OS, too — just not
> in the VM startup/shutdown path.
> 
>> So once you scratch all this - and you say you only care about single disc
>> then you are able to use more efficient metadata formats which you could
>> even keep permanently in memory during the lifetime - this all adds great
>> performance.
>>
>> But it all depends how you could constrain your environment.
>>
>> It's worth to mention there is lvm2 support for 'external' 'thin volume'
>> creators - so lvm2 only maintains 'thin-pool' data & metadata LV - but thin
>> volume creation, activation, deactivation of thins is left to external tool.
>> This has been used by docker for a while - later on they switched to
>> overlayFs I believe..
> 
> That indeeds sounds like a good choice for Qubes OS.  It would allow the
> data and metadata LVs to be any volume type that lvm2 supports, and
> managed using all of lvm2’s features.  So one could still put the
> metadata on a RAID-10 volume while everything else is RAID-6, or set up
> a dm-cache volume to store the data (please correct me if I am wrong).
> Qubes OS has already moved to using a separate thin pool for virtual
> machines, as it prevents dom0 (privileged management VM) from being run
> out of disk space (by accident or malice).  That means that the thin
> pool use for guests is managed only by Qubes OS, and so the standard
> lvm2 tools do not need to touch it.
> 
> Is this a setup that you would recommend, and would be comfortable using
> in production?  As far as metadata is concerned, Qubes OS has its own
> XML file containing metadata about all qubes, which should suffice for
> this purpose.  To prevent races during updates and ensure automatic
> crash recovery, is it sufficient to store metadata for both new and old
> transaction IDs, and pick the correct one based on the device-mapper
> status line?  I have seen lvm2 get in an inconsistent state (transaction
> ID off by one) that required manual repair before, which is quite
> unnerving for a desktop OS.

My biased advice would be to stay with lvm2. There is a lot of work involved,
many things are not well documented, and getting everything running correctly
will take a lot of effort (Docker in fact did not manage to do it well and was
incapable of providing any recoverability).

> One feature that would be nice is to be able to import an
> externally-provided mapping of thin pool device numbers to LV names, so
> that lvm2 could provide a (read-only, and not guaranteed fresh) view of
> system state for reporting purposes.

Once you have evidence that it's lvm2 causing a major issue, you could
consider whether it's worth stepping into a separate project.


>>>> It's worth to mention - the more bullet-proof you will want to make your
>>>> project - the more closer to the extra processing made by lvm2 you will get.
>>>
>>> Why is this?  How does lvm2 compare to stratis, for example?
>>
>> Stratis is yet another volume manager written in Rust combined with XFS for
>> easier user experience. That's all I'd probably say about it...
> 
> That’s fine.  I guess my question is why making lvm2 bullet-proof needs
> so much overhead.

It's difficult - if you were distributing lvm2 with an exact kernel version &
udev & systemd on a single Linux distro, that would eliminate a huge set of troubles...

>>>> However before you will step into these waters - you should probably
>>>> evaluate whether thin-pool actually meet your needs if you have that high
>>>> expectation for number of supported volumes - so you will not end up with
>>>> hyper fast snapshot creation while the actual usage then is not meeting your
>>>> needs...
>>>
>>> What needs are you thinking of specifically?  Qubes OS needs block
>>> devices, so filesystem-backed storage would require the use of loop
>>> devices unless I use ZFS zvols.  Do you have any specific
>>> recommendations?
>>
>> As long as you live in the world without crashes, buggy kernels, apps  and
>> failing hard drives everything looks very simple.
> 
> Would you mind explaining further?  LVM2 RAID and cache volumes should
> provide most of the benefits that Qubes OS desires, unless I am missing
> something.

I'm not familiar with Qubes OS - but in many real-world cases we can't
push the latest & greatest to our users - so we need to live with bugs and add
workarounds...

>> And every development costs quite some time & money.
> 
> That it does.
> 
>> Since you mentioned ZFS - you might want focus on using 'ZFS-only' solution.
>> Combining  ZFS or Btrfs with lvm2 is always going to be a painful way as
>> those filesystems have their own volume management.
> 
> Absolutely!  That said, I do wonder what your thoughts on using loop
> devices for VM storage are.  I know they are slower than thin volumes,
> but they are also much easier to manage, since they are just ordinary
> disk files.  Any filesystem with reflink can provide the needed
> copy-on-write support.

A chain of filesystem->block_layer->filesystem->block_layer is something you most
likely do not want to use for any well-performing solution...
But it's ok for testing...

Regards

Zdenek



_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 17:43         ` Zdenek Kabelac
@ 2022-01-30 20:27           ` Gionatan Danti
  2022-01-30 21:17             ` Demi Marie Obenour
  2022-02-02  2:09           ` Demi Marie Obenour
  1 sibling, 1 reply; 22+ messages in thread
From: Gionatan Danti @ 2022-01-30 20:27 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Demi Marie Obenour

Il 2022-01-30 18:43 Zdenek Kabelac ha scritto:
> Chain filesystem->block_layer->filesystem->block_layer is something
> you most likely do not want to use for any well performing solution...
> But it's ok for testing...

I second that.

Demi Marie - just a question: are you sure you really need a block
device? I don't know Qubes OS, but both KVM and Xen can use files as
virtual disks. This would let you avoid loopback mounts.

Regards.


-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 20:27           ` Gionatan Danti
@ 2022-01-30 21:17             ` Demi Marie Obenour
  2022-01-31  7:52               ` Gionatan Danti
  0 siblings, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-01-30 21:17 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 864 bytes --]

On Sun, Jan 30, 2022 at 09:27:56PM +0100, Gionatan Danti wrote:
> Il 2022-01-30 18:43 Zdenek Kabelac ha scritto:
> > Chain filesystem->block_layer->filesystem->block_layer is something
> > you most likely do not want to use for any well performing solution...
> > But it's ok for testing...
> 
> I second that.
> 
> Demi Marie - just a question: are you sure do you really needs a block
> device? I don't know QubeOS, but both KVM and Xen can use files as virtual
> disks. This would enable you to ignore loopback mounts.

On Xen, the paravirtualised block backend driver (blkback) requires a
block device, so file-based virtual disks are implemented with a loop
device managed by the toolstack.  Suggestions for improving this
less-than-satisfactory situation are welcome.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 16:45       ` Demi Marie Obenour
  2022-01-30 17:43         ` Zdenek Kabelac
@ 2022-01-30 21:39         ` Stuart D. Gathman
  2022-01-30 22:14           ` Demi Marie Obenour
  2022-01-31  7:47           ` Gionatan Danti
  1 sibling, 2 replies; 22+ messages in thread
From: Stuart D. Gathman @ 2022-01-30 21:39 UTC (permalink / raw)
  To: LVM general discussion and development, Zdenek Kabelac

On Sun, 2022-01-30 at 11:45 -0500, Demi Marie Obenour wrote:
> On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
> > 
> 
> > Since you mentioned ZFS - you might want focus on using 'ZFS-only'
> > solution.
> > Combining  ZFS or Btrfs with lvm2 is always going to be a painful
> > way as
> > those filesystems have their own volume management.
> 
> Absolutely!  That said, I do wonder what your thoughts on using loop
> devices for VM storage are.  I know they are slower than thin
> volumes,
> but they are also much easier to manage, since they are just ordinary
> disk files.  Any filesystem with reflink can provide the needed
> copy-on-write support.

I use loop devices for test cases - especially with simulated IO
errors.  Devs really appreciate having an easy reproducer for
database/filesystem bugs (which often involve handling of IO errors). 
But not for production VMs.

I use LVM as flexible partitions (i.e. only classic LVs, no thin pool).
Classic LVs perform like partitions, literally using the same driver
(device mapper) with a small number of extents, and are if anything
more recoverable than partition tables.  We used to put LVM on bare
drives (like AIX did) - who needs a partition table?  But on Wintel,
you need a partition table for EFI and so that alien operating systems
know there is something already on a disk.

Your VM usage is different from ours - you seem to need to clone and
activate a VM quickly (like a vps provider might need to do).  We
generally have to buy more RAM to add a new VM :-), so performance of
creating a new LV is the least of our worries.

Since we use LVs like partitions - mixing with btrfs is not an issue. 
Just use the LVs like partitions.  I haven't tried ZFS on Linux - it
may have LVM-like features that could fight with LVM.  ZFS would be my
first choice on a BSD box.

We do not use LVM RAID - we either run mdraid underneath, or let btrfs
do its data duplication thing with LVs on different spindles.





_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 21:39         ` Stuart D. Gathman
@ 2022-01-30 22:14           ` Demi Marie Obenour
  2022-01-31 21:29             ` Marian Csontos
  2022-01-31  7:47           ` Gionatan Danti
  1 sibling, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-01-30 22:14 UTC (permalink / raw)
  To: LVM general discussion and development
  Cc: Marek Marczykowski-Górecki, Zdenek Kabelac


[-- Attachment #1.1: Type: text/plain, Size: 1945 bytes --]

On Sun, Jan 30, 2022 at 04:39:30PM -0500, Stuart D. Gathman wrote:
> On Sun, 2022-01-30 at 11:45 -0500, Demi Marie Obenour wrote:
> > On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
> > > 
> > 
> > > Since you mentioned ZFS - you might want focus on using 'ZFS-only'
> > > solution.
> > > Combining  ZFS or Btrfs with lvm2 is always going to be a painful
> > > way as
> > > those filesystems have their own volume management.
> > 
> > Absolutely!  That said, I do wonder what your thoughts on using loop
> > devices for VM storage are.  I know they are slower than thin
> > volumes,
> > but they are also much easier to manage, since they are just ordinary
> > disk files.  Any filesystem with reflink can provide the needed
> > copy-on-write support.
> 
> I use loop devices for test cases - especially with simulated IO
> errors.  Devs really appreciate having an easy reproducer for
> database/filesystem bugs (which often involve handling of IO errors). 
> But not for production VMs.
> 
> I use LVM as flexible partitions (i.e. only classic LVs, no thin pool).
> Classic LVs perform like partitions, literally using the same driver
> (device mapper) with a small number of extents, and are if anything
> more recoverable than partition tables.  We used to put LVM on bare
> drives (like AIX did) - who needs a partition table?  But on Wintel,
> you need a partition table for EFI and so that alien operating systems
> know there is something already on a disk.
> 
> Your VM usage is different from ours - you seem to need to clone and
> activate a VM quickly (like a vps provider might need to do).  We
> generally have to buy more RAM to add a new VM :-), so performance of
> creating a new LV is the least of our worries.

To put it mildly, yes :).  Ideally we could get VM boot time down to
100ms or lower.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 21:39         ` Stuart D. Gathman
  2022-01-30 22:14           ` Demi Marie Obenour
@ 2022-01-31  7:47           ` Gionatan Danti
  1 sibling, 0 replies; 22+ messages in thread
From: Gionatan Danti @ 2022-01-31  7:47 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Zdenek Kabelac

Il 2022-01-30 22:39 Stuart D. Gathman ha scritto:
> I use LVM as flexible partitions (i.e. only classic LVs, no thin pool).
> Classic LVs perform like partitions, literally using the same driver
> (device mapper) with a small number of extents, and are if anything
> more recoverable than partition tables.  We used to put LVM on bare
> drives (like AIX did) - who needs a partition table?  But on Wintel,
> you need a partition table for EFI and so that alien operating systems
> know there is something already on a disk.

Classical (fat) LVs are rock solid, but how do you cope with fast (maybe
rolling) snapshotting? That is the main selling point of thin LVM.

> Since we use LVs like partitions - mixing with btrfs is not an issue.
> Just use the LVs like partitions.  I haven't tried ZFS on linux - it
> may have LVM like features that could fight with LVM.  ZFS would be my
> first choice on a BSD box.

I use ZFS broadly - and yes, it is a wonderful tool. That said, it has
its own gotchas. For example:
- snapshot rollback is a destructive operation (ie: after a rollback, you
permanently lose the current filesystem state);
- clones (writable snapshots) depend on the read-only base image (ie: on
the original snapshot), which you cannot delete while its clones are
around.

Moreover, snapshotting/cloning a ZFS dataset (or volume) does not appear
to be significantly faster than LVM - sometimes it requires ~1s,
depending on the load.

> We do not use LVM raid - but either run mdraid underneath, or let btrfs
> do it's data duplication thing with LVs on different spindles.

I have always found btrfs to underperform on random-rewrite
workloads such as VMs and DBs. Can I ask about your experience?
Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 21:17             ` Demi Marie Obenour
@ 2022-01-31  7:52               ` Gionatan Danti
  0 siblings, 0 replies; 22+ messages in thread
From: Gionatan Danti @ 2022-01-31  7:52 UTC (permalink / raw)
  To: Demi Marie Obenour; +Cc: LVM general discussion and development

Il 2022-01-30 22:17 Demi Marie Obenour ha scritto:
> On Xen, the paravirtualised block backend driver (blkback) requires a
> block device, so file-based virtual disks are implemented with a loop
> device managed by the toolstack.  Suggestions for improving this
> less-than-satisfactory situation are welcome.

Ah - I expected that with something like

disk = [ 'file:mydisk.img,hda,w' ]

Xen would directly use "mydisk.img" as the backend disk file. Does
it instead automatically create a loopback device?

I mainly use KVM, and maybe I am spoiled by its ability to use
basically any datastore as a backing disk.
Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 22:14           ` Demi Marie Obenour
@ 2022-01-31 21:29             ` Marian Csontos
  2022-02-03  4:48               ` Demi Marie Obenour
  0 siblings, 1 reply; 22+ messages in thread
From: Marian Csontos @ 2022-01-31 21:29 UTC (permalink / raw)
  To: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 3556 bytes --]

On Sun, Jan 30, 2022 at 11:17 PM Demi Marie Obenour <
demi@invisiblethingslab.com> wrote:

> On Sun, Jan 30, 2022 at 04:39:30PM -0500, Stuart D. Gathman wrote:
> > Your VM usage is different from ours - you seem to need to clone and
> > activate a VM quickly (like a vps provider might need to do).  We
> > generally have to buy more RAM to add a new VM :-), so performance of
> > creating a new LV is the least of our worries.
>
> To put it mildly, yes :).  Ideally we could get VM boot time down to
> 100ms or lower.
>

Out of curiosity, is snapshot creation the main obstacle to booting a VM in
under 100ms? Does Qubes OS use tweaked Linux distributions to achieve the
desired boot time?

Back to business. Perhaps I missed an answer to this question: Are the
Qubes OS VMs throw-away?  Throw-away in the sense that many containers are
- just a runtime which can be "easily" reconstructed. If so, you can
ignore the safety belts and try to squeeze out more performance by sacrificing
(meta)data integrity.

And the answer to that question seems to be both yes and no - the classic
pets vs. cattle distinction.

As I understand it, aside from the system VMs, there are at least two kinds
of user domains, and these have different requirements:

1. a few permanent pet VMs (Work, Personal, Banking, ...), in Qubes OS called
AppVMs,
2. and many transient cattle VMs (e.g. for opening an attachment from
email, browsing the web, or batch processing of received files), called
Disposable VMs.

For AppVMs, there are only a "few" of those, and they are running most of the
time, so start time may be less important than data safety. Creation is
certainly only a once-in-a-while operation, so I would say use LVM for
these. And where snapshots are not required, use plain linear LVs - one less
thing that could go wrong. However, AppVMs are created from Template VMs,
so snapshots seem to be part of the system. But the data may be on linear LVs
anyway, as it is not shared and is the most important part of the
system. And you can still use old-style snapshots for backing up the data
(and by backup I mean snapshot, copy, delete snapshot - not a long-term
snapshot, and definitely not multiple snapshots).
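
Roughly like this (a sketch; names, sizes and the backup destination are
placeholders):

  import subprocess

  def lvm(*args):
      subprocess.run(args, check=True)

  # Old-style COW snapshot of a linear LV, sized for the changes
  # expected while the copy runs.
  lvm("lvcreate", "-s", "-L", "10G", "-n", "work-bak", "vg/work-private")
  subprocess.run(["dd", "if=/dev/vg/work-bak",
                  "of=/backup/work-private.img", "bs=4M", "conv=fsync"],
                 check=True)
  lvm("lvremove", "-y", "vg/work-bak")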

Now I realize there is a third kind of user domain - Template VMs.
Similarly to AppVMs, there are only a few of those, and creating them
requires downloading an image, upgrading the system on an existing template, or
even installing the system from scratch, so any LVM overhead is insignificant for
these. Use thin volumes.

For the Disposable VMs it is the creation + startup time which matters. Use
whatever the fastest method is. These are created from Template VMs too.
What LVM/DM has to offer here is the external origin feature. The templates
themselves could be managed by LVM, and Qubes OS could use them as external
origins for Disposable VMs using device mapper directly. These could be held
in a disposable thin pool which can be reinitialized from scratch on host
reboot, after a crash, or on any problem with the pool. As a bonus, this would
also work around the absence of thin-pool shrinking.
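
Both layers can express that - for example (all names, sizes and device IDs
below are illustrative only):

  import subprocess

  def run(*args):
      subprocess.run(args, check=True)

  # lvm2 flavour: a thin volume whose external origin is a read-only
  # template LV living outside the (disposable) thin pool.
  run("lvcreate", "-s", "vg/template-root",
      "--thinpool", "vg/disppool", "-n", "disp7-root")

  # device-mapper flavour: create the thin device and name the external
  # origin as the last argument of the 'thin' table, bypassing lvm2.
  run("dmsetup", "message", "vg-disppool-tpool", "0", "create_thin 7")
  run("dmsetup", "create", "disp7-root", "--table",
      "0 20971520 thin /dev/mapper/vg-disppool-tpool 7"
      " /dev/vg/template-root")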

I wonder if a pool of ready-to-use VMs could solve some of the startup-time
issues - keep $POOL_SIZE VMs (all using LVM) ready, just inject the
data into one of the VMs when needed, and prepare a new one asynchronously.
That way you could have, to some extent, both the quick start and the data
safety, as a solution for a hypothetical third kind of domain requiring them
- e.g. a Disposable VM spawned to edit a file from a third party, where you
want to keep the state across a reboot or a system crash.

[-- Attachment #1.2: Type: text/html, Size: 4213 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-30 17:43         ` Zdenek Kabelac
  2022-01-30 20:27           ` Gionatan Danti
@ 2022-02-02  2:09           ` Demi Marie Obenour
  2022-02-02 10:04             ` Zdenek Kabelac
  1 sibling, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-02-02  2:09 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 8168 bytes --]

On Sun, Jan 30, 2022 at 06:43:13PM +0100, Zdenek Kabelac wrote:
> Dne 30. 01. 22 v 17:45 Demi Marie Obenour napsal(a):
> > On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
> > > Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
> > > > On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
> > > > > Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
> > > > > > How much slower are operations on an LVM2 thin pool compared to manually
> > > > > > managing a dm-thin target via ioctls?  I am mostly concerned about
> > > > > > volume snapshot, creation, and destruction.  Data integrity is very
> > > > > > important, so taking shortcuts that risk data loss is out of the
> > > > > > question.  However, the application may have some additional information
> > > > > > that LVM2 does not have.  For instance, it may know that the volume that
> > > > > > it is snapshotting is not in use, or that a certain volume it is
> > > > > > creating will never be used after power-off.
> > > > > > 
> > > > 
> > > > > So brave developers may always write their own management tools for their
> > > > > constrained environment requirements that will by significantly faster in
> > > > > terms of how many thins you could create per minute (btw you will need to
> > > > > also consider dropping usage of udev on such system)
> > > > 
> > > > What kind of constraints are you referring to?  Is it possible and safe
> > > > to have udev running, but told to ignore the thins in question?
> > > 
> > > Lvm2 is oriented more towards managing set of different disks,
> > > where user is adding/removing/replacing them.  So it's more about
> > > recoverability, good support for manual repair  (ascii metadata),
> > > tracking history of changes,  backward compatibility, support
> > > of conversion to different volume types (i.e. caching of thins, pvmove...)
> > > Support for no/udev & no/systemd, clusters and nearly every linux distro
> > > available... So there is a lot - and this all adds quite complexity.
> > 
> > I am certain it does, and that makes a lot of sense.  Thanks for the
> > hard work!  Those features are all useful for Qubes OS, too — just not
> > in the VM startup/shutdown path.
> > 
> > > So once you scratch all this - and you say you only care about single disc
> > > then you are able to use more efficient metadata formats which you could
> > > even keep permanently in memory during the lifetime - this all adds great
> > > performance.
> > > 
> > > But it all depends how you could constrain your environment.
> > > 
> > > It's worth to mention there is lvm2 support for 'external' 'thin volume'
> > > creators - so lvm2 only maintains 'thin-pool' data & metadata LV - but thin
> > > volume creation, activation, deactivation of thins is left to external tool.
> > > This has been used by docker for a while - later on they switched to
> > > overlayFs I believe..
> > 
> > That indeeds sounds like a good choice for Qubes OS.  It would allow the
> > data and metadata LVs to be any volume type that lvm2 supports, and
> > managed using all of lvm2’s features.  So one could still put the
> > metadata on a RAID-10 volume while everything else is RAID-6, or set up
> > a dm-cache volume to store the data (please correct me if I am wrong).
> > Qubes OS has already moved to using a separate thin pool for virtual
> > machines, as it prevents dom0 (privileged management VM) from being run
> > out of disk space (by accident or malice).  That means that the thin
> > pool use for guests is managed only by Qubes OS, and so the standard
> > lvm2 tools do not need to touch it.
> > 
> > Is this a setup that you would recommend, and would be comfortable using
> > in production?  As far as metadata is concerned, Qubes OS has its own
> > XML file containing metadata about all qubes, which should suffice for
> > this purpose.  To prevent races during updates and ensure automatic
> > crash recovery, is it sufficient to store metadata for both new and old
> > transaction IDs, and pick the correct one based on the device-mapper
> > status line?  I have seen lvm2 get in an inconsistent state (transaction
> > ID off by one) that required manual repair before, which is quite
> > unnerving for a desktop OS.
> 
> My biased advice would be to stay with lvm2. There is lot of work, many
> things are not well documented and getting everything running correctly will
> take a lot of effort  (Docker in fact did not managed to do it well and was
> incapable to provide any recoverability)

What did Docker do wrong?  Would it be possible for a future version of
lvm2 to be able to automatically recover from off-by-one thin pool
transaction IDs?

> > One feature that would be nice is to be able to import an
> > externally-provided mapping of thin pool device numbers to LV names, so
> > that lvm2 could provide a (read-only, and not guaranteed fresh) view of
> > system state for reporting purposes.
> 
> Once you will have evidence it's the lvm2 causing major issue - you could
> consider whether it's worth to step into a separate project.

Agreed.

> > > > > It's worth to mention - the more bullet-proof you will want to make your
> > > > > project - the more closer to the extra processing made by lvm2 you will get.
> > > > 
> > > > Why is this?  How does lvm2 compare to stratis, for example?
> > > 
> > > Stratis is yet another volume manager written in Rust combined with XFS for
> > > easier user experience. That's all I'd probably say about it...
> > 
> > That’s fine.  I guess my question is why making lvm2 bullet-proof needs
> > so much overhead.
> 
> It's difficult - if you would be distributing lvm2 with exact kernel version
> & udev & systemd with a single linux distro - it reduces huge set of
> troubles...

Qubes OS comes close to this in practice.  systemd and udev versions are
known and fixed, and Qubes OS ships its own kernels.

> > > > > However before you will step into these waters - you should probably
> > > > > evaluate whether thin-pool actually meet your needs if you have that high
> > > > > expectation for number of supported volumes - so you will not end up with
> > > > > hyper fast snapshot creation while the actual usage then is not meeting your
> > > > > needs...
> > > > 
> > > > What needs are you thinking of specifically?  Qubes OS needs block
> > > > devices, so filesystem-backed storage would require the use of loop
> > > > devices unless I use ZFS zvols.  Do you have any specific
> > > > recommendations?
> > > 
> > > As long as you live in the world without crashes, buggy kernels, apps  and
> > > failing hard drives everything looks very simple.
> > 
> > Would you mind explaining further?  LVM2 RAID and cache volumes should
> > provide most of the benefits that Qubes OS desires, unless I am missing
> > something.
> 
> I'm not familiar with QubesOS - but in many cases in real-life world we
> can't push to our users latest&greatest - so we need to live with bugs and
> add workarounds...

Qubes OS is more than capable of shipping fixes for kernel bugs.  Is
that what you are referring to?

> > > Since you mentioned ZFS - you might want focus on using 'ZFS-only' solution.
> > > Combining  ZFS or Btrfs with lvm2 is always going to be a painful way as
> > > those filesystems have their own volume management.
> > 
> > Absolutely!  That said, I do wonder what your thoughts on using loop
> > devices for VM storage are.  I know they are slower than thin volumes,
> > but they are also much easier to manage, since they are just ordinary
> > disk files.  Any filesystem with reflink can provide the needed
> > copy-on-write support.
> 
> Chain filesystem->block_layer->filesystem->block_layer is something you most
> likely do not want to use for any well performing solution...
> But it's ok for testing...

How much of this is due to the slow loop driver?  How much of it could
be mitigated if btrfs supported an equivalent of zvols?

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-02-02  2:09           ` Demi Marie Obenour
@ 2022-02-02 10:04             ` Zdenek Kabelac
  2022-02-03  0:23               ` Demi Marie Obenour
  0 siblings, 1 reply; 22+ messages in thread
From: Zdenek Kabelac @ 2022-02-02 10:04 UTC (permalink / raw)
  To: Demi Marie Obenour; +Cc: LVM general discussion and development

Dne 02. 02. 22 v 3:09 Demi Marie Obenour napsal(a):
> On Sun, Jan 30, 2022 at 06:43:13PM +0100, Zdenek Kabelac wrote:
>> Dne 30. 01. 22 v 17:45 Demi Marie Obenour napsal(a):
>>> On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
>>>> Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
>>>>> On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
>>>>>> Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
>> My biased advice would be to stay with lvm2. There is lot of work, many
>> things are not well documented and getting everything running correctly will
>> take a lot of effort  (Docker in fact did not managed to do it well and was
>> incapable to provide any recoverability)
> 
> What did Docker do wrong?  Would it be possible for a future version of
> lvm2 to be able to automatically recover from off-by-one thin pool
> transaction IDs?

Ensuring all steps in the state machine are always correct is not exactly simple.
But since I've not heard about the off-by-one problem for a long while, I
believe we've managed to close all the holes and bugs in the double-commit system
and metadata handling by thin-pool and lvm2... (for a recent lvm2 & kernel).

>> It's difficult - if you would be distributing lvm2 with exact kernel version
>> & udev & systemd with a single linux distro - it reduces huge set of
>> troubles...
> 
> Qubes OS comes close to this in practice.  systemd and udev versions are
> known and fixed, and Qubes OS ships its own kernels.

Systemd/udev evolves - so fixed today doesn't really mean the same version will
be there tomorrow.  And unfortunately systemd is known to introduce
backward-incompatible changes from time to time...

>> I'm not familiar with QubesOS - but in many cases in real-life world we
>> can't push to our users latest&greatest - so we need to live with bugs and
>> add workarounds...
> 
> Qubes OS is more than capable of shipping fixes for kernel bugs.  Is
> that what you are referring to?
I'm not going to start discussing this topic ;)

>> Chain filesystem->block_layer->filesystem->block_layer is something you most
>> likely do not want to use for any well performing solution...
>> But it's ok for testing...
> 
> How much of this is due to the slow loop driver?  How much of it could
> be mitigated if btrfs supported an equivalent of zvols?

Here you are missing the core of the problem from the kernel POV, i.e.
how memory allocation works and what approximations the kernel makes in
buffer handling and so on.
So whoever is using 'loop' devices in production systems in the way described
above has never really tested any corner-case logic...

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-02-02 10:04             ` Zdenek Kabelac
@ 2022-02-03  0:23               ` Demi Marie Obenour
  2022-02-03 12:04                 ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-02-03  0:23 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 3111 bytes --]

On Wed, Feb 02, 2022 at 11:04:37AM +0100, Zdenek Kabelac wrote:
> Dne 02. 02. 22 v 3:09 Demi Marie Obenour napsal(a):
> > On Sun, Jan 30, 2022 at 06:43:13PM +0100, Zdenek Kabelac wrote:
> > > Dne 30. 01. 22 v 17:45 Demi Marie Obenour napsal(a):
> > > > On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
> > > > > Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
> > > > > > On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
> > > > > > > Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):
> > > My biased advice would be to stay with lvm2. There is lot of work, many
> > > things are not well documented and getting everything running correctly will
> > > take a lot of effort  (Docker in fact did not managed to do it well and was
> > > incapable to provide any recoverability)
> > 
> > What did Docker do wrong?  Would it be possible for a future version of
> > lvm2 to be able to automatically recover from off-by-one thin pool
> > transaction IDs?
> 
> Ensuring all steps in state-machine are always correct is not exactly simple.
> But since I've not heard about off-by-one problem for a long while -  I
> believe we've managed to close all the holes and bugs in double-commit
> system
> and metadata handling by thin-pool and lvm2.... (for recent lvm2 & kernel)

How recent are you talking about?  Are there fixes that can be
cherry-picked?  I somewhat recently triggered this issue on a test
machine, so I would like to know.
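
For what it's worth, here is a rough sketch of how such a mismatch can be
observed (Python; the VG/pool names are hypothetical and it assumes lvm2's
usual <vg>-<pool>-tpool dm node for the active pool):

  #!/usr/bin/env python3
  # Sketch only: compare the transaction_id recorded by lvm2 with the one
  # the kernel thin-pool target reports; a persistent mismatch is the
  # "off-by-one" symptom discussed here.  Names below are made up.
  import subprocess

  VG, POOL = "qubes_dom0", "vm-pool"

  def lvm_txid() -> int:
      out = subprocess.run(
          ["lvs", "--noheadings", "-o", "transaction_id", f"{VG}/{POOL}"],
          check=True, capture_output=True, text=True).stdout
      return int(out.strip())

  def kernel_txid() -> int:
      # lvm2 normally exposes the thin-pool target as <vg>-<pool>-tpool
      dm = f"{VG.replace('-', '--')}-{POOL.replace('-', '--')}-tpool"
      out = subprocess.run(["dmsetup", "status", dm],
                           check=True, capture_output=True, text=True).stdout
      # thin-pool status line: <start> <len> thin-pool <transaction id> ...
      return int(out.split()[3])

  if __name__ == "__main__":
      a, b = lvm_txid(), kernel_txid()
      print(f"lvm2: {a}  kernel: {b}")
      if a != b:
          print("transaction_id mismatch - see lvconvert --repair")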

> > > It's difficult - if you would be distributing lvm2 with exact kernel version
> > > & udev & systemd with a single linux distro - it reduces huge set of
> > > troubles...
> > 
> > Qubes OS comes close to this in practice.  systemd and udev versions are
> > known and fixed, and Qubes OS ships its own kernels.
> 
> Systemd/udev evolves - so fixed today doesn't really mean same version will
> be there tomorrow.  And unfortunately systemd is known to introduce
> backward incompatible changes from time to time...

Thankfully, in Qubes OS’s dom0, the version of systemd is frozen and
will never change throughout an entire release.

> > > Chain filesystem->block_layer->filesystem->block_layer is something you most
> > > likely do not want to use for any well performing solution...
> > > But it's ok for testing...
> > 
> > How much of this is due to the slow loop driver?  How much of it could
> > be mitigated if btrfs supported an equivalent of zvols?
> 
> Here you are missing the core of problem from kernel POV aka
> how the memory allocation is working and what are the approximation in
> kernel with buffer handling and so on.
> So whoever is using  'loop' devices in production systems in the way
> described above has never really tested any corner case logic....

In Qubes OS the loop device is always passed through to a VM or used as
the base device for an old-style device-mapper snapshot.  It is never
mounted on the host.  Are there known problems with either of these
configurations?
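
(For concreteness, here is a rough sketch of the second configuration - an
old-style dm snapshot stacked on a file-backed loop device.  The paths and
chunk size are made up; this is not the actual Qubes driver code.)

  #!/usr/bin/env python3
  # Illustrative only: build a persistent dm-snapshot over a loop device.
  import subprocess

  def run(*cmd) -> str:
      return subprocess.run(cmd, check=True, capture_output=True,
                            text=True).stdout.strip()

  base = run("losetup", "--find", "--show", "/var/lib/qubes/root.img")
  cow  = run("losetup", "--find", "--show", "/var/lib/qubes/root-cow.img")
  sectors = run("blockdev", "--getsz", base)

  # snapshot table: <start> <len> snapshot <origin> <COW dev> <P|N> <chunksize>
  run("dmsetup", "create", "demo-snap",
      "--table", f"0 {sectors} snapshot {base} {cow} P 8")
  print("created /dev/mapper/demo-snap")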

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-01-31 21:29             ` Marian Csontos
@ 2022-02-03  4:48               ` Demi Marie Obenour
  2022-02-03 12:28                 ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-02-03  4:48 UTC (permalink / raw)
  To: LVM general discussion and development


[-- Attachment #1.1: Type: text/plain, Size: 5827 bytes --]

On Mon, Jan 31, 2022 at 10:29:04PM +0100, Marian Csontos wrote:
> On Sun, Jan 30, 2022 at 11:17 PM Demi Marie Obenour <
> demi@invisiblethingslab.com> wrote:
> 
>> On Sun, Jan 30, 2022 at 04:39:30PM -0500, Stuart D. Gathman wrote:
>>> Your VM usage is different from ours - you seem to need to clone and
>>> activate a VM quickly (like a vps provider might need to do).  We
>>> generally have to buy more RAM to add a new VM :-), so performance of
>>> creating a new LV is the least of our worries.
>>
>> To put it mildly, yes :).  Ideally we could get VM boot time down to
>> 100ms or lower.
>>
> 
> Out of curiosity, is snapshot creation the main culprit to boot a VM in
> under 100ms? Does Qubes OS use tweaked linux distributions, to achieve the
> desired boot time?

The goal is 100ms from user action until PID 1 starts in the guest.
After that, it’s the job of whatever distro the guest is running.
Storage management is one area that needs to be optimized to achieve
this, though it is not the only one.

> Back to business. Perhaps I missed an answer to this question: Are the
> Qubes OS VMs throw away?  Throw away in the sense like many containers are
> - it's just a runtime which can be "easily" reconstructed. If so, you can
> ignore the safety belts and try to squeeze more performance by sacrificing
> (meta)data integrity.

Why does a trade-off need to be made here?  More specifically, why is it
not possible to be reasonably fast (a few ms) AND safe?

> And the answer to that question seems to be both Yes and No. Classical pets
> vs cattle.
> 
> As I understand it, except of the system VMs, there are at least two kinds
> of user domains and these have different requirements:
> 
> 1. few permanent pet VMs (Work, Personal, Banking, ...), in Qubes OS called
> AppVMs,
> 2. and many transient cattle VMs (e.g. for opening an attachment from
> email, or browsing web, or batch processing of received files) called
> Disposable VMs.
> 
> For AppVMs, there are only "few" of those and these are running most of the
> time so start time may be less important than data safety. Certainly
> creation time is only once in a while operation so I would say use LVM for
> these. And where snapshots are not required, use plain linear LVs, one less
> thing which could go wrong. However, AppVMs are created from Template VMs,
> so snapshots seem to be part of the system.

Snapshots are used and required *everywhere*.  Qubes OS offers
copy-on-write cloning support, and users expect it to be cheap, not
least because renaming a qube is implemented using it.  By default,
AppVM private and TemplateVM root volumes always have at least one
snapshot, to support `qvm-volume revert`.  Start time really matters
too; a user may not wish to have every qube running at once.

In short, performance and safety *both* matter, and data AND metadata
operations are performance-critical.
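
To make the snapshot pattern concrete, a minimal sketch (hypothetical VG/LV
names; the real storage driver is more involved) of what a clone plus a
`qvm-volume revert`-style rollback boils down to:

  #!/usr/bin/env python3
  # Illustrative sketch: thin snapshot at shutdown, then "revert" by swapping
  # the snapshot back under the original name.  Names are made up.
  import subprocess

  def lvm(*args):
      subprocess.run(["lvm", *args], check=True)

  VG = "qubes_dom0"

  # cheap CoW snapshot of the private volume (e.g. taken at VM shutdown);
  # -kn avoids the activation-skip flag thin snapshots get by default
  lvm("lvcreate", "-s", "-kn", "-n", "vm-work-private-snap",
      f"{VG}/vm-work-private")

  # revert: drop the current volume and rename the snapshot into its place
  lvm("lvremove", "-y", f"{VG}/vm-work-private")
  lvm("lvrename", VG, "vm-work-private-snap", "vm-work-private")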

> But data may be on linear LVs
> anyway as these are not shared and these are the most important part of the
> system. And you can still use old style snapshots for backing up the data
> (and by backup I mean snapshot, copy, delete snapshot. Not a long term
> snapshot. And definitely not multiple snapshots).

Creating a qube is intended to be a cheap operation, so thin
provisioning of storage is required.  Qubes OS also relies heavily
on over-provisioning of storage, so linear LVs and old style snapshots
won’t fly.  Qubes OS does have a storage driver that uses dm-snapshot on
top of loop devices, but that is deprecated, since it cannot provide the
features Qubes OS requires.  As just one example, the default private
volume size is 2GiB, but many qubes use nowhere near this amount of disk
space.
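
As a rough illustration of the over-provisioning this relies on (names and
sizes are invented):

  #!/usr/bin/env python3
  # Illustrative sketch: many 2 GiB thin volumes sharing one modest pool;
  # pool space is only consumed as blocks are actually written.
  import subprocess

  def lvm(*args):
      subprocess.run(["lvm", *args], check=True)

  VG = "qubes_dom0"
  lvm("lvcreate", "--type", "thin-pool", "-L", "100G", "-n", "vm-pool", VG)

  for i in range(200):   # 200 x 2 GiB of virtual size over a 100 GiB pool
      lvm("lvcreate", "-V", "2G", "--thinpool", f"{VG}/vm-pool",
          "-n", f"vm-demo{i}-private")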

> Now I realized there is the third kind of user domains - Template VMs.
> Similarly to App VM, there are only few of those, and creating them
> requires downloading an image, upgrading system on an existing template, or
> even installation of the system, so any LVM overhead is insignificant for
> these. Use thin volumes.
> 
> For the Disposable VMs it is the creation + startup time which matters. Use
> whatever is the fastest method. These are created from template VMs too.
> What LVM/DM has to offer here is external origin. So the templates
> themselves could be managed by LVM, and Qubes OS could use them as external
> origin for Disposable VMs using device mapper directly. These could be held
> in a disposable thin pool which can be reinitialized from scratch on host
> reboot, after a crash, or on a problem with the pool. As a bonus this would
> also address the absence of thin pool shrinking.

That is an interesting idea I had not considered, but it would add
substantial complexity to the storage management system.  More
generally, the same approach could be used for all volatile volumes,
which are intended to be thrown away after qube shutdown.  Qubes OS even
supports encrypting volatile volumes with an ephemeral key to guarantee
they are unrecoverable.  (Disposable VM private volumes should support
this, but currently do not.)
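
For reference, the raw device-mapper version of that external-origin idea
looks roughly like this (made-up names, sizes and dev_id; all the metadata
bookkeeping and error handling that lvm2 normally does is omitted, which is
exactly the added complexity):

  #!/usr/bin/env python3
  # Illustrative sketch: create a thin device in a "disposable" pool and map
  # it with a read-only template LV as external origin.
  import subprocess

  def run(*cmd) -> str:
      return subprocess.run(cmd, check=True, capture_output=True,
                            text=True).stdout.strip()

  POOL   = "/dev/mapper/disposable-pool"       # thin-pool dm device
  ORIGIN = "/dev/qubes_dom0/template-root"     # read-only template volume
  DEV_ID = 42                                  # caller must track these IDs

  sectors = run("blockdev", "--getsz", ORIGIN)

  # allocate a new thin device inside the pool ...
  run("dmsetup", "message", POOL, "0", f"create_thin {DEV_ID}")
  # ... and map it: reads of unprovisioned blocks fall through to ORIGIN,
  # writes are provisioned in the pool
  run("dmsetup", "create", "dispvm42-root",
      "--table", f"0 {sectors} thin {POOL} {DEV_ID} {ORIGIN}")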

> I wonder if a pool of ready to be used VMs could solve some of the startup
> time issues - keep $POOL_SIZE VMs (all using LVM) ready and just inject the
> data to one of the VMs when needed and prepare a new one asynchronously. So
> you could have to some extent both the quick start and data safety as a
> solution for the hypothetical third kind of domains requiring them - e.g. a
> Disposable VM spawn to edit a file from a third party - you want to keep
> the state on a reboot or a system crash.

That is also a good idea, but it is orthogonal to which storage driver
is in use.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-02-03  0:23               ` Demi Marie Obenour
@ 2022-02-03 12:04                 ` Zdenek Kabelac
  2022-02-03 12:04                   ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: Zdenek Kabelac @ 2022-02-03 12:04 UTC (permalink / raw)
  To: LVM general discussion and development, Demi Marie Obenour

Dne 03. 02. 22 v 1:23 Demi Marie Obenour napsal(a):
> On Wed, Feb 02, 2022 at 11:04:37AM +0100, Zdenek Kabelac wrote:
>> Dne 02. 02. 22 v 3:09 Demi Marie Obenour napsal(a):
>>> On Sun, Jan 30, 2022 at 06:43:13PM +0100, Zdenek Kabelac wrote:
>>>> Dne 30. 01. 22 v 17:45 Demi Marie Obenour napsal(a):
>>>>> On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
>>>>>> Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
>>>>>>> On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
>>>>>>>> Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):

>> Ensuring all steps in state-machine are always correct is not exactly simple.
>> But since I've not heard about off-by-one problem for a long while -  I
>> believe we've managed to close all the holes and bugs in double-commit
>> system
>> and metadata handling by thin-pool and lvm2.... (for recent lvm2 & kernel)
> 
> How recent are you talking about?  Are there fixes that can be
> cherry-picked?  I somewhat recently triggered this issue on a test
> machine, so I would like to know.

I'd avoid cherry-picking unless you have deep knowledge of all the connections
between patches.
And always use the latest released kernel before commenting on whether things
are slow or fast.

>> Here you are missing the core of problem from kernel POV aka
>> how the memory allocation is working and what are the approximation in
>> kernel with buffer handling and so on.
>> So whoever is using  'loop' devices in production systems in the way
>> described above has never really tested any corner case logic....
> 
> In Qubes OS the loop device is always passed through to a VM or used as
> the base device for an old-style device-mapper snapshot.  It is never
> mounted on the host.  Are there known problems with either of these
> configurations?
> 

That is an inefficient design - you should prefer to pass devices directly.
I.e. you might gain some benefit at creation time, but the VM's overall
performance will be lower during its actual usage... pick your poison...

Especially if the backend is NVMe - any extra layer adds a tremendous amount
of latency (and in fact DM alone is also noticeable)...

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-02-03 12:04                 ` Zdenek Kabelac
@ 2022-02-03 12:04                   ` Zdenek Kabelac
  0 siblings, 0 replies; 22+ messages in thread
From: Zdenek Kabelac @ 2022-02-03 12:04 UTC (permalink / raw)
  To: linux-lvm

Dne 03. 02. 22 v 1:23 Demi Marie Obenour napsal(a):
> On Wed, Feb 02, 2022 at 11:04:37AM +0100, Zdenek Kabelac wrote:
>> Dne 02. 02. 22 v 3:09 Demi Marie Obenour napsal(a):
>>> On Sun, Jan 30, 2022 at 06:43:13PM +0100, Zdenek Kabelac wrote:
>>>> Dne 30. 01. 22 v 17:45 Demi Marie Obenour napsal(a):
>>>>> On Sun, Jan 30, 2022 at 11:52:52AM +0100, Zdenek Kabelac wrote:
>>>>>> Dne 30. 01. 22 v 1:32 Demi Marie Obenour napsal(a):
>>>>>>> On Sat, Jan 29, 2022 at 10:32:52PM +0100, Zdenek Kabelac wrote:
>>>>>>>> Dne 29. 01. 22 v 21:34 Demi Marie Obenour napsal(a):

>> Ensuring all steps in state-machine are always correct is not exactly simple.
>> But since I've not heard about off-by-one problem for a long while -  I
>> believe we've managed to close all the holes and bugs in double-commit
>> system
>> and metadata handling by thin-pool and lvm2.... (for recent lvm2 & kernel)
> 
> How recent are you talking about?  Are there fixes that can be
> cherry-picked?  I somewhat recently triggered this issue on a test
> machine, so I would like to know.

I'd avoid cherry-picking unless you have deep knowledge of all the connections
between patches.
And always use the latest released kernel before commenting on whether things
are slow or fast.

>> Here you are missing the core of problem from kernel POV aka
>> how the memory allocation is working and what are the approximation in
>> kernel with buffer handling and so on.
>> So whoever is using  'loop' devices in production systems in the way
>> described above has never really tested any corner case logic....
> 
> In Qubes OS the loop device is always passed through to a VM or used as
> the base device for an old-style device-mapper snapshot.  It is never
> mounted on the host.  Are there known problems with either of these
> configurations?
> 

That is an inefficient design - you should prefer to pass devices directly.
I.e. you might gain some benefit at creation time, but the VM's overall
performance will be lower during its actual usage... pick your poison...

Especially if the backend is NVMe - any extra layer adds a tremendous amount
of latency (and in fact DM alone is also noticeable)...

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-02-03  4:48               ` Demi Marie Obenour
@ 2022-02-03 12:28                 ` Zdenek Kabelac
  2022-02-04  0:01                   ` Demi Marie Obenour
  0 siblings, 1 reply; 22+ messages in thread
From: Zdenek Kabelac @ 2022-02-03 12:28 UTC (permalink / raw)
  To: LVM general discussion and development, Demi Marie Obenour

Dne 03. 02. 22 v 5:48 Demi Marie Obenour napsal(a):
> On Mon, Jan 31, 2022 at 10:29:04PM +0100, Marian Csontos wrote:
>> On Sun, Jan 30, 2022 at 11:17 PM Demi Marie Obenour <
>> demi@invisiblethingslab.com> wrote:
>>
>>> On Sun, Jan 30, 2022 at 04:39:30PM -0500, Stuart D. Gathman wrote:
>>>> Your VM usage is different from ours - you seem to need to clone and
>>>> activate a VM quickly (like a vps provider might need to do).  We
>>>> generally have to buy more RAM to add a new VM :-), so performance of
>>>> creating a new LV is the least of our worries.
>>>
>>> To put it mildly, yes :).  Ideally we could get VM boot time down to
>>> 100ms or lower.
>>>
>>
>> Out of curiosity, is snapshot creation the main culprit to boot a VM in
>> under 100ms? Does Qubes OS use tweaked linux distributions, to achieve the
>> desired boot time?
> 
> The goal is 100ms from user action until PID 1 starts in the guest.
> After that, it’s the job of whatever distro the guest is running.
> Storage management is one area that needs to be optimized to achieve
> this, though it is not the only one.

I'm wondering where that 100ms target comes from?

Users often mistakenly aim at the wrong technology for their task.

If they need containerized software they should use containers, e.g. Docker -
if they need a fully virtualized, secure machine, it certainly has its price
(mainly much higher memory consumption).
I have some doubts that there is a really good reason to create VMs quickly,
as they are surely supposed to be long-lived entities (hours/days...).

So unless you want to create something for marketing purposes - aka "my table
is bigger than yours" - I don't see the point.

For quick instances of software apps I'd always recommend containers - which
are vastly more efficient and scalable.

VMs and containers have their strengths and weaknesses...
Not sure why so many people try to pretend VMs can be as efficient as
containers, or containers as secure as VMs. Just always pick the right tool...


>> Back to business. Perhaps I missed an answer to this question: Are the
>> Qubes OS VMs throw away?  Throw away in the sense like many containers are
>> - it's just a runtime which can be "easily" reconstructed. If so, you can
>> ignore the safety belts and try to squeeze more performance by sacrificing
>> (meta)data integrity.
> 
> Why does a trade-off need to be made here?  More specifically, why is it
> not possible to be reasonably fast (a few ms) AND safe?

Security, safety and determinism always take away efficiency.

The more randomness you can live with, the faster processing you can
achieve - you just need to cross your fingers :)
(i.e. drop transaction synchronisation :))

Quite frankly - if you are orchestrating mostly the same VMs, it would be more
efficient to just snapshot them with an already running memory environment -
so instead of always booting a VM from 'scratch', you restore/resume those VMs
at some already-running point, from which they can start to deviate.
Why waste CPU & time processing the same boot over and over....
That is where you should hunt your milliseconds...
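
Roughly something like this (using Xen's xl toolstack just as an example -
names are invented, and a real setup would have to deal with domain
name/UUID collisions, device re-plumbing etc.):

  #!/usr/bin/env python3
  # Illustrative sketch: checkpoint a quiesced reference VM once, then bring
  # up "new" instances by resuming the checkpoint instead of booting.
  import subprocess

  DOMAIN = "dispvm-template"
  IMAGE  = "/var/lib/xen/dispvm-template.save"

  # take a memory checkpoint, leaving the reference VM running (-c)
  subprocess.run(["xl", "save", "-c", DOMAIN, IMAGE], check=True)

  # later, instead of a cold boot:
  subprocess.run(["xl", "restore", IMAGE], check=True)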


Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-02-03 12:28                 ` Zdenek Kabelac
@ 2022-02-04  0:01                   ` Demi Marie Obenour
  2022-02-04 10:16                     ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: Demi Marie Obenour @ 2022-02-04  0:01 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Marek Marczykowski-Górecki


[-- Attachment #1.1: Type: text/plain, Size: 4336 bytes --]

On Thu, Feb 03, 2022 at 01:28:37PM +0100, Zdenek Kabelac wrote:
> Dne 03. 02. 22 v 5:48 Demi Marie Obenour napsal(a):
> > On Mon, Jan 31, 2022 at 10:29:04PM +0100, Marian Csontos wrote:
> > > On Sun, Jan 30, 2022 at 11:17 PM Demi Marie Obenour <
> > > demi@invisiblethingslab.com> wrote:
> > > 
> > > > On Sun, Jan 30, 2022 at 04:39:30PM -0500, Stuart D. Gathman wrote:
> > > > > Your VM usage is different from ours - you seem to need to clone and
> > > > > activate a VM quickly (like a vps provider might need to do).  We
> > > > > generally have to buy more RAM to add a new VM :-), so performance of
> > > > > creating a new LV is the least of our worries.
> > > > 
> > > > To put it mildly, yes :).  Ideally we could get VM boot time down to
> > > > 100ms or lower.
> > > > 
> > > 
> > > Out of curiosity, is snapshot creation the main culprit to boot a VM in
> > > under 100ms? Does Qubes OS use tweaked linux distributions, to achieve the
> > > desired boot time?
> > 
> > The goal is 100ms from user action until PID 1 starts in the guest.
> > After that, it’s the job of whatever distro the guest is running.
> > Storage management is one area that needs to be optimized to achieve
> > this, though it is not the only one.
> 
> I'm wondering from where those 100ms came from?
> 
> Users often mistakenly target for wrong technologies for their tasks.
> 
> If they need to use containerized software they should use containers like
> i.e. Docker - if they need full virtual secure machine - it certainly has
> it's price (mainly way higher memory consumption)
> I've some doubts there is some real good reason to have quickly created VMs
> as they surely are supposed to be a long time living entities
> (hours/days...)

Simply put, Qubes OS literally does not have a choice.  Qubes OS is
intended to protect against very high-level attackers who are likely to
have 0day exploits against the Linux kernel.  And it is trying to do the
best possible given that constraint.  A microkernel *could* provide
sufficiently strong isolation, but there are none that have sufficiently
broad hardware support and sufficiently capable userlands.

In the long term, I would like to use unikernels for at least some of
the VMs.  Unikernels can start up so quickly that the largest overhead
is the hypervisor’s toolstack.  But that is very much off-topic.

> So unless you want to create something for marketing purposes aka - my table
> is bigger then yours - I don't see the point.
> 
> For quick instancies of software apps I'd always recommend containers -
> which are vastly more efficient and scalable.
> 
> VMs and containers have its strength and weaknesses..
> Not sure why some many people try to pretend VMs can be as efficient as
> containers or containers as secure as VMs. Just always pick the right
> tool...

Qubes OS needs secure *and* fast.  To quote the seL4 microkernel’s
mantra, “Security is no excuse for poor performance!”.

> > > Back to business. Perhaps I missed an answer to this question: Are the
> > > Qubes OS VMs throw away?  Throw away in the sense like many containers are
> > > - it's just a runtime which can be "easily" reconstructed. If so, you can
> > > ignore the safety belts and try to squeeze more performance by sacrificing
> > > (meta)data integrity.
> > 
> > Why does a trade-off need to be made here?  More specifically, why is it
> > not possible to be reasonably fast (a few ms) AND safe?
> 
> Security, safety and determinism always takes away efficiency.
> 
> The higher amount of randomness you can live with, the faster processing you
> can achieve - you just need to cross you fingers :)
> (i.e. drop transaction synchornisation :))
> 
> Quite frankly - if you are orchestrating mostly same VMs, it would be more
> efficient, to just snapshot them with already running memory environment -
> so instead of booting VM always from 'scratch', you restore/resume those VMs
> at some already running point - from which it could start deviate.
> Why wasting CPU&time on processing over and over same boot....
> There you should hunt your miliseconds...

Qubes OS used to do that, but it was a significant maintenance burden.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 201 bytes --]

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] LVM performance vs direct dm-thin
  2022-02-04  0:01                   ` Demi Marie Obenour
@ 2022-02-04 10:16                     ` Zdenek Kabelac
  0 siblings, 0 replies; 22+ messages in thread
From: Zdenek Kabelac @ 2022-02-04 10:16 UTC (permalink / raw)
  To: LVM general discussion and development, Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki

Dne 04. 02. 22 v 1:01 Demi Marie Obenour napsal(a):
> On Thu, Feb 03, 2022 at 01:28:37PM +0100, Zdenek Kabelac wrote:
>> Dne 03. 02. 22 v 5:48 Demi Marie Obenour napsal(a):
>>> On Mon, Jan 31, 2022 at 10:29:04PM +0100, Marian Csontos wrote:
>>>> On Sun, Jan 30, 2022 at 11:17 PM Demi Marie Obenour <
>>>> demi@invisiblethingslab.com> wrote:
>>>>
>>>>> On Sun, Jan 30, 2022 at 04:39:30PM -0500, Stuart D. Gathman wrote:

>> If they need to use containerized software they should use containers like
>> i.e. Docker - if they need full virtual secure machine - it certainly has
>> it's price (mainly way higher memory consumption)
>> I've some doubts there is some real good reason to have quickly created VMs
>> as they surely are supposed to be a long time living entities
>> (hours/days...)
> 
> Simply put, Qubes OS literally does not have a choice.  Qubes OS is
> intended to protect against very high-level attackers who are likely to

I'd say you are putting your effort into the wrong place then.
I.e. the effort you are putting into optimizing this is nowhere near as
effective as using things properly...

>> VMs and containers have its strength and weaknesses..
>> Not sure why some many people try to pretend VMs can be as efficient as
>> containers or containers as secure as VMs. Just always pick the right
>> tool...
> 
> Qubes OS needs secure *and* fast.  To quote the seL4 microkernel’s
> mantra, “Security is no excuse for poor performance!”.

And whoever tells you he can get the same performance from a VM as from a
container has no idea how an OS works...

Security simply *IS* expensive (especially with Intel CPUs ;))

An educated user needs to pick the level he wants to pay for it.

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-02-04 10:20 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-29 20:34 [linux-lvm] LVM performance vs direct dm-thin Demi Marie Obenour
2022-01-29 21:32 ` Zdenek Kabelac
2022-01-30  0:32   ` Demi Marie Obenour
2022-01-30 10:52     ` Zdenek Kabelac
2022-01-30 16:45       ` Demi Marie Obenour
2022-01-30 17:43         ` Zdenek Kabelac
2022-01-30 20:27           ` Gionatan Danti
2022-01-30 21:17             ` Demi Marie Obenour
2022-01-31  7:52               ` Gionatan Danti
2022-02-02  2:09           ` Demi Marie Obenour
2022-02-02 10:04             ` Zdenek Kabelac
2022-02-03  0:23               ` Demi Marie Obenour
2022-02-03 12:04                 ` Zdenek Kabelac
2022-02-03 12:04                   ` Zdenek Kabelac
2022-01-30 21:39         ` Stuart D. Gathman
2022-01-30 22:14           ` Demi Marie Obenour
2022-01-31 21:29             ` Marian Csontos
2022-02-03  4:48               ` Demi Marie Obenour
2022-02-03 12:28                 ` Zdenek Kabelac
2022-02-04  0:01                   ` Demi Marie Obenour
2022-02-04 10:16                     ` Zdenek Kabelac
2022-01-31  7:47           ` Gionatan Danti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).