* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-14 20:40 ` David Teigland
@ 2020-02-15 5:22 ` heming.zhao
2020-02-15 12:40 ` Zdenek Kabelac
2020-02-15 19:07 ` Gionatan Danti
2 siblings, 0 replies; 12+ messages in thread
From: heming.zhao @ 2020-02-15 5:22 UTC (permalink / raw)
To: David Teigland, Gionatan Danti; +Cc: linux-lvm
Hello David,
I accept your points; commit c527a0cbfc3 is correct.
I am still not sure whether the correct fix will have unintended consequences.
I think most people only configure devices/filter on their machines.
After this commit, a machine with duplicated devices has to configure the same filter rule twice:
one copy for devices/filter, another for devices/global_filter. That is weird.
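A minimal sketch of that duplication, assuming a hypothetical duplicate device /dev/sdb and a hypothetical helper name (render_devices_section). It only renders lvm.conf-style text from one pattern list and does not call LVM:

```python
def render_devices_section(patterns):
    """Render one pattern list into both filter settings of a devices section.

    Illustrative helper only: it builds text and does not validate the
    patterns or modify any real lvm.conf.
    """
    quoted = ", ".join('"%s"' % p for p in patterns)
    return (
        "devices {\n"
        "\tfilter = [ %s ]\n"
        "\tglobal_filter = [ %s ]\n"
        "}\n"
    ) % (quoted, quoted)


if __name__ == "__main__":
    # /dev/sdb stands in for the duplicated device; the name is an assumption.
    print(render_devices_section(["r|^/dev/sdb$|", "a|.*|"]))
```

Generating both lists from one source at least keeps the two copies from drifting apart.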
The legacy code has been around for a long time, and many machines rely on it.
I quickly checked the code: before c527a0cbfc3, lvmetad_filter was
mainly used in _pvscan_cache(), and only under rare conditions. All other cases used devices/filter.
So I suggest one of the following:
1. Could lvm2 keep the wrong filter (full_filter) usage?
That would keep existing machines running as usual.
Or it could add a new config item (e.g. pvscan_compat_filter = 0|1) that lets the user choose
the filter behaviour.
2. (A lot of work) backport the mainline single-filter code to the stable-2.02 branch.
Finally, a small code tip for the mainline branch:
To remove the in cfg_array(devices_global_filter_CFG in lib/config/config_settings.h
It generates useless config info.
Thanks.
On 2/15/20 4:40 AM, David Teigland wrote:
> On Fri, Feb 14, 2020 at 08:34:19PM +0100, Gionatan Danti wrote:
>> Hi David, being filters one of the most asked questions, can I ask why we
>> have so many different filters, leading to such complex interactions and
>> behaviors?
>>
>> Don't get me wrong: I am sure you (the lvm team) have very good reasons to
>> do that, and I am surely missing something? But what, precisely? How should
>> we (end users) consider filters? Should we only use global_filter?
>
> You're right, filters are difficult to understand and use correctly. The
> complexity and confusion in the code is no better. With the removal of
> lvmetad in 2.03 versions (e.g. RHEL8) there's no difference between filter
> and global_filter, so that's some small improvement. But, I think filters
> should be replaced or overhauled with something easier to use and more
> useful at a technical level.
>
> I've created a bz about that and welcome thoughts about what a replacement
> should or should not be like. With input the work is more likely to be
> prioritized.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1803266
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-14 20:40 ` David Teigland
2020-02-15 5:22 ` heming.zhao
@ 2020-02-15 12:40 ` Zdenek Kabelac
2020-02-15 19:15 ` Gionatan Danti
2020-02-15 19:07 ` Gionatan Danti
2 siblings, 1 reply; 12+ messages in thread
From: Zdenek Kabelac @ 2020-02-15 12:40 UTC (permalink / raw)
To: LVM general discussion and development, David Teigland, Gionatan Danti
Cc: heming.zhao
Dne 14. 02. 20 v 21:40 David Teigland napsal(a):
> On Fri, Feb 14, 2020 at 08:34:19PM +0100, Gionatan Danti wrote:
>> Hi David, being filters one of the most asked questions, can I ask why we
>> have so many different filters, leading to such complex interactions and
>> behaviors?
>>
>> Don't get me wrong: I am sure you (the lvm team) have very good reasons to
>> do that, and I am surely missing something? But what, precisely? How should
>> we (end users) consider filters? Should we only use global_filter?
>
> You're right, filters are difficult to understand and use correctly. The
> complexity and confusion in the code is no better. With the removal of
> lvmetad in 2.03 versions (e.g. RHEL8) there's no difference between filter
> and global_filter, so that's some small improvement. But, I think filters
> should be replaced or overhauled with something easier to use and more
> useful at a technical level.
>
> I've created a bz about that and welcome thoughts about what a replacement
> should or should not be like. With input the work is more likely to be
> prioritized.
>
One of the reasons for having 2 sets of filters was the presence of a universal
'scanning' tool (aka udev), which assesses and reads devices in a system,
combined with various 'VM' environments where actual devices are
passed to guest systems on your hosting machine.
So there are many different combinations where different commands may need to
see different subsets of devices. For example, your guest machine should not have
an impact on the correctness of your 'hosting' machine no matter what the guest
writes (e.g. duplicated signatures...)
For many home users with a single set of devices this may look like an
'overkill' solution. But in the more generic world there is unfortunately
not yet any widely used/accepted solution to the core problem,
'who is the owner of a device', so having several sets of filters
was the only solution we were able to create.
It's worth noting that lvm2 solves far more of these issues than other similar
device technologies (e.g. mdraid, btrfs...), where it's very easy to cause big
confusion and data corruption (even unnoticed) once duplicates appear in
your system...
Zdenek
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-15 12:40 ` Zdenek Kabelac
@ 2020-02-15 19:15 ` Gionatan Danti
2020-02-15 20:19 ` Zdenek Kabelac
2020-02-15 20:49 ` Chris Murphy
0 siblings, 2 replies; 12+ messages in thread
From: Gionatan Danti @ 2020-02-15 19:15 UTC (permalink / raw)
To: Zdenek Kabelac
Cc: David Teigland, heming.zhao, LVM general discussion and development
Il 2020-02-15 13:40 Zdenek Kabelac ha scritto:
> Dne 14. 02. 20 v 21:40 David Teigland napsal(a):
>> On Fri, Feb 14, 2020 at 08:34:19PM +0100, Gionatan Danti wrote:
>>> Hi David, being filters one of the most asked questions, can I ask
>>> why we
>>> have so many different filters, leading to such complex interactions
>>> and
>>> behaviors?
>>>
>>> Don't get me wrong: I am sure you (the lvm team) have very good
>>> reasons to
>>> do that, and I am surely missing something? But what, precisely? How
>>> should
>>> we (end users) consider filters? Should we only use global_filter?
>>
>> You're right, filters are difficult to understand and use correctly.
>> The
>> complexity and confusion in the code is no better. With the removal
>> of
>> lvmetad in 2.03 versions (e.g. RHEL8) there's no difference between
>> filter
>> and global_filter, so that's some small improvement. But, I think
>> filters
>> should be replaced or overhauled with something easier to use and more
>> useful at a technical level.
>>
>> I've created a bz about that and welcome thoughts about what a
>> replacement
>> should or should not be like. With input the work is more likely to
>> be
>> prioritized.
>>
>
> One of the 'reason' for having 2 sets of filter was the presence of
> universal 'scanning' tool (aka udev) - which is assessing & reading
> devices in a system and its combination with various 'VM' environments
> where actual device are passed to guest systems on your hosting
> machine.
>
> So there are many different combinations where different commands may
> need to see different subset of devices - so i.e. your guest machine
> should not have an impact on correctness of your 'hosting' machine no
> matter what guess will write (i.e. duplicating signatures...)
Sure. But why is a single, valid filter set not sufficient? In
other words, why/when can I not simply use global_filter and ignore the
"plain" filter?
> While in many cases for many single home users with single set of
> devices this can be seen maybe as an 'overkill' solution - in the more
> generic world where there is unfortunately not yet any widely
> used/accepted solution solving the core problem: 'who is the owner of
> a device' having several sets of filter was the only solution we were
> able to create.
True. I myself have seen setups where hosts had direct visibility of
guest-created logical volumes. The obvious solution was to correctly set
global_filter. However, I have the impression that a good share of the
complexity/issues/unexpected behaviors is due to the fact that LVM can be
nested (PV inside LV inside VG inside PV inside ...)
> It's worth to note lvm2 is solving way more issues then other similar
> device technology (i.e. mdraid, btrfs....) where it's very simple to
> cause big confusion and data corruptions (even unnoticed) once
> duplicates appears in your system...
>
> Zdenek
I never duplicate devices with mdraid, but BTRFS is so fragile that
taking a simple LVM snapshot of a BTRFS component device can lead to
data corruption.
I really think the gold standard here is ZFS.
Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-15 19:15 ` Gionatan Danti
@ 2020-02-15 20:19 ` Zdenek Kabelac
2020-02-16 15:17 ` Gionatan Danti
2020-02-15 20:49 ` Chris Murphy
1 sibling, 1 reply; 12+ messages in thread
From: Zdenek Kabelac @ 2020-02-15 20:19 UTC (permalink / raw)
To: Gionatan Danti
Cc: heming.zhao, David Teigland, LVM general discussion and development
Dne 15. 02. 20 v 20:15 Gionatan Danti napsal(a):
> Il 2020-02-15 13:40 Zdenek Kabelac ha scritto:
>> Dne 14. 02. 20 v 21:40 David Teigland napsal(a):
>>> On Fri, Feb 14, 2020 at 08:34:19PM +0100, Gionatan Danti wrote:
>>>> Hi David, being filters one of the most asked questions, can I ask why we
>>>> have so many different filters, leading to such complex interactions and
>>>> behaviors?
>>>>
>>>> Don't get me wrong: I am sure you (the lvm team) have very good reasons to
>>>> do that, and I am surely missing something? But what, precisely? How should
>>>> we (end users) consider filters? Should we only use global_filter?
>>>
>>> You're right, filters are difficult to understand and use correctly. The
>>> complexity and confusion in the code is no better. With the removal of
>>> lvmetad in 2.03 versions (e.g. RHEL8) there's no difference between filter
>>> and global_filter, so that's some small improvement. But, I think filters
>>> should be replaced or overhauled with something easier to use and more
>>> useful at a technical level.
>>>
>>> I've created a bz about that and welcome thoughts about what a replacement
>>> should or should not be like. With input the work is more likely to be
>>> prioritized.
>>>
>>
>> One of the 'reason' for having 2 sets of filter was the presence of
>> universal 'scanning' tool (aka udev) - which is assessing & reading
>> devices in a system and its combination with various 'VM' environments
>> where actual device are passed to guest systems on your hosting
>> machine.
>>
>> So there are many different combinations where different commands may
>> need to see different subset of devices - so i.e. your guest machine
>> should not have an impact on correctness of your 'hosting' machine no
>> matter what guess will write (i.e. duplicating signatures...)
>
> Sure. But why having a single, valid filter set is not sufficient? In other
> words, why/when I can not simply using global_filter, ignoring "plain" filter?
The problem with a single filter, which lvmetad 'tried' to resolve, was this:
udev should 'see' all devices in your system, so lvmetad should know about
all devices in the system (even with duplicates and all sorts of
inconsistencies and garbage). The idea was 'nice', but the actual
implementation raised more troubles than it solved.
But ATM we still have a sort of 'pvscan' run from udev
and lvm commands run by the admin, which can run with a different '--config'.
So the 'current' (ATM) difference is:
global_filter - never scan such devices on a machine
filter - never scan such devices within a single command.
The idea is that you can have different sets of commands operating on
different subsets of devices on your machine, which might be useful in the
world of 'containers' & VMs & clusters...
So while 'global_filter' should almost never change, changing 'filter' is
kind of OK during a system's lifetime.
Now that there is no lvmetad anymore, having 2 different 'filter' settings is
'less' fancy, and both cases could somehow be solved with just a single
filter (as there is simply no cache and there is always some scan).
But correctness with VMs and other bigger systems can be better handled
with 2 filter levels, where basically the 'admin' sets 'hard' borders with
global_filter and tools can play with 'filter' within the already preselected
subset of devices...
As has been said, it's not very useful if there are just a couple of disks
:)...
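
The two levels described above can be sketched as a toy model. This is only an illustration under stated assumptions, not LVM's implementation: it uses Python's `re` instead of LVM's own regex engine, checks a single path per device (LVM considers all aliases of a device), handles only patterns of the form a<delim>regex<delim> or r<delim>regex<delim>, and the function and device names are hypothetical.

```python
import re


def parse(pattern):
    """Split a pattern such as 'a|^/dev/sda|' into (accept, compiled_regex)."""
    action, delim = pattern[0], pattern[1]
    if action not in ("a", "r") or pattern[-1] != delim:
        raise ValueError("unsupported pattern: %r" % pattern)
    return action == "a", re.compile(pattern[2:-1])


def passes(patterns, path):
    """The first matching pattern decides; a path matching nothing is accepted."""
    for pattern in patterns:
        accept, regex = parse(pattern)
        if regex.search(path):  # patterns are unanchored unless they use ^ or $
            return accept
    return True


def visible(global_filter, command_filter, path):
    """A device is usable only if it passes the hard border and the per-command filter."""
    return passes(global_filter, path) and passes(command_filter, path)


if __name__ == "__main__":
    # Hypothetical host: guest LVs live under /dev/vg_host/guest_*.
    hard_border = ["r|^/dev/vg_host/guest_|"]
    only_sdb = ["a|^/dev/sdb|", "r|.*|"]
    print(visible(hard_border, only_sdb, "/dev/sdb1"))
    print(visible(hard_border, ["a|.*|"], "/dev/vg_host/guest_disk"))
```

In this model a per-command 'filter' can narrow the set of devices further, but it can never bring back a device that global_filter rejected.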
>> It's worth to note lvm2 is solving way more issues then other similar
>> device technology (i.e. mdraid, btrfs....) where it's very simple to
>> cause big confusion and data corruptions (even unnoticed) once
>> duplicates appears in your system...
>>
>> Zdenek
>
> I never duplicate devices with mdraid, but BTRFS is so fragile that taking a
> simple LVM snapshot of a BTRFS component device can lead to data corruption.
>
> I really think the gold standard here is ZFS.
IMHO ZFS is 'somewhat' slow to play with...
and I've no idea how ZFS can resolve all correctness issues in kernel...
Zdenek
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-15 20:19 ` Zdenek Kabelac
@ 2020-02-16 15:17 ` Gionatan Danti
0 siblings, 0 replies; 12+ messages in thread
From: Gionatan Danti @ 2020-02-16 15:17 UTC (permalink / raw)
To: Zdenek Kabelac
Cc: David Teigland, heming.zhao, LVM general discussion and development
Il 2020-02-15 21:19 Zdenek Kabelac ha scritto:
> IMHO ZFS is 'somewhat' slow to play with...
> and I've no idea how ZFS can resolve all correctness issues in
> kernel...
>
> Zdenek
Oh, it surely does *not* solve all correctness issues. Rather, having
much simpler constraints (and use cases), it simply avoids many issues.
That said, what LVM achieves despite all the abstraction layers and very
different goals/use cases really is impressive.
So, thanks to the LVM team for the hard work!
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-15 19:15 ` Gionatan Danti
2020-02-15 20:19 ` Zdenek Kabelac
@ 2020-02-15 20:49 ` Chris Murphy
2020-02-16 15:28 ` Gionatan Danti
1 sibling, 1 reply; 12+ messages in thread
From: Chris Murphy @ 2020-02-15 20:49 UTC (permalink / raw)
To: LVM general discussion and development
On Sat, Feb 15, 2020 at 12:22 PM Gionatan Danti <g.danti@assyoma.it> wrote:
>
> Il 2020-02-15 13:40 Zdenek Kabelac ha scritto:
> > It's worth to note lvm2 is solving way more issues then other similar
> > device technology (i.e. mdraid, btrfs....) where it's very simple to
> > cause big confusion and data corruptions (even unnoticed) once
> > duplicates appears in your system...
> >
> > Zdenek
>
> I never duplicate devices with mdraid, but BTRFS is so fragile that
> taking a simple LVM snapshot of a BTRFS component device can lead to
> data corruption.
>
> I really think the gold standard here is ZFS.
Are you referring to this known problem?
https://btrfs.wiki.kernel.org/index.php/Gotchas#Block-level_copies_of_devices
By default the snapshot LV isn't active, so the problem doesn't
happen. I've taken many LVM thinp snapshots of Btrfs file systems,
including while they're actively being written to, and never run into
this problem (or any other).
An LVM snapshot comes with FIFREEZE, and supported filesystems,
including Btrfs, should have a consistent snapshot created as a
result. I don't think ZFS supports FIFREEZE/FITHAW and if that's
correct, you're effectively getting a powerfail/crash type behavior
with an LVM snapshot of a ZFS file system, relying entirely on its
own ability to maintain file system consistency.
My dualist opinion on mixing these layers: while it should work, and
if there's corruption then there's a bug somewhere, adding layers
increases complexity and thus risk. That's possibly a good idea in a
testing/qualification context, where you want something that is sensitive
to, and consistently flags, any discrepancy. That's not fragility.
--
Chris Murphy
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-15 20:49 ` Chris Murphy
@ 2020-02-16 15:28 ` Gionatan Danti
0 siblings, 0 replies; 12+ messages in thread
From: Gionatan Danti @ 2020-02-16 15:28 UTC (permalink / raw)
To: LVM general discussion and development; +Cc: Chris Murphy
Il 2020-02-15 21:49 Chris Murphy ha scritto:
> Are you referring to this known problem?
> https://btrfs.wiki.kernel.org/index.php/Gotchas#Block-level_copies_of_devices
Yes.
> By default the snapshot LV isn't active, so the problem doesn't
> happen. I've taken many LVM thinp snapshots of Btrfs file systems,
> including while they're actively being written to, and never run into
> this problem (or any other).
Thin LVM snapshots are not active by default, yes. But you *need* to
activate them to access their data.
Moreover, classical (non-thin) LVM snapshots are automatically activated
when taken.
> An LVM snapshot comes with FIFREEZE, and supported filesystems,
> including Btrfs, should have a consistent snapshot created as a
> result. I don't think ZFS supports FIFREEZE/FITHAW and if that's
> correct, you're effectively getting a powerfail/crash type behavior
> with an LVM snapshot of a ZFS file system, entirely trusting on its
> own ability to maintain file system consistency.
True, but the transactional nature of ZFS writes means that a clean
recovery option should always be available. Anyway, any modern journaled
filesystem will not corrupt itself on power loss/recovery (async write
back data will be lost, obviously).
> My dualist opinion on mixing these layers: while it should work, and
> if there's corruption then there's a bug somewhere, adding layers
> increases complexity and thus risk. That's possibly a good idea in a
> testing/qualification context, where you want something sensitive to
> and consistently flags any discrepancy. That's not fragility.
I am not sure about that: one of BTRFS's main goals was to not duplicate
code, relying on standard Linux block device behavior as much as
possible. For this reason, I tend to think that snapshotting (and using)
the block device under a BTRFS device should be a supported use case.
But hey - the LVM team is really doing awesome work!
Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
2020-02-14 20:40 ` David Teigland
2020-02-15 5:22 ` heming.zhao
2020-02-15 12:40 ` Zdenek Kabelac
@ 2020-02-15 19:07 ` Gionatan Danti
2 siblings, 0 replies; 12+ messages in thread
From: Gionatan Danti @ 2020-02-15 19:07 UTC (permalink / raw)
To: David Teigland; +Cc: linux-lvm, heming.zhao
Il 2020-02-14 21:40 David Teigland ha scritto:
> You're right, filters are difficult to understand and use correctly.
> The
> complexity and confusion in the code is no better. With the removal of
> lvmetad in 2.03 versions (e.g. RHEL8) there's no difference between
> filter
> and global_filter, so that's some small improvement. But, I think
> filters
> should be replaced or overhauled with something easier to use and more
> useful at a technical level.
>
> I've created a bz about that and welcome thoughts about what a
> replacement
> should or should not be like. With input the work is more likely to be
> prioritized.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1803266
Hi David, I think that part of the problem is the unclear/vague
description of filters (e.g. "plain" filter vs global_filter). In other
words, maybe the real problem is a documentation one.
For example: am I right in saying that global_filter was introduced as a
"fail-safe" mechanism to protect udev & the like from a "plain" filter
directive overridden on the command line?
If so, I am not sure the comment in lvm.conf fully conveys this message
(and I cannot find much in the man pages, either). If not, and I am wrong
about filter vs global_filter, then, well, this somewhat proves the
point above :)
Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8