On Sat, Sep 26, 2020, 8:30 AM Duncan Townsend <duncancmt@gmail.com> wrote:
On Thu, Sep 24, 2020 at 1:54 PM Zdenek Kabelac <zkabelac@redhat.com> wrote:
> Dne 23. 09. 20 v 21:54 Duncan Townsend napsal(a):
> > On Wed, Sep 23, 2020 at 2:49 PM Zdenek Kabelac <zkabelac@redhat.com> wrote:
> >> Dne 23. 09. 20 v 20:13 Duncan Townsend napsal(a):
> > I apologize, but I am back with a related problem. After
> > editing the metadata file and replacing the transaction number, my
> > system became serviceable again. After making absolutely sure that
> > dmeventd was running correctly, my next order of business was to
> > finish backing up before any other tragedy struck. Unfortunately,
> > taking a snapshot as part of the backup process has once again brought
> > my system to its knees. The first error message I saw was:
>
> Hi
>
> And now you've hit an interesting bug inside the lvm2 code - I've opened a new BZ:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1882483
>
> This actually explains a few so-far poorly understood problems I've
> seen before without a good explanation of how to hit them.

Ahh! Exciting! Thanks for opening the bugzilla. Is there any
additional information I could supply to make this easier?
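
If it would help, I'm happy to collect more debugging output. My
understanding (please correct me if this isn't what you want) is that
something along these lines would capture the relevant state - the log
path and snapshot name are just my own placeholders:

    # collect a general lvm2 debugging bundle, including metadata
    lvmdump -m

    # or re-run the failing snapshot with maximum verbosity and keep
    # the output for the BZ
    lvcreate -vvvv -s -n <SNAPSHOT NAME> <VG>/<THIN LV> 2> /tmp/lvcreate-debug.log

Just say the word and I'll attach whichever of those is useful.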

> >    WARNING: Sum of all thin volume sizes (XXX TiB) exceeds the size of
> > thin pool <VG>/<THIN POOL LV> and the size of whole volume group (YYY
> > TiB).
> >    device-mapper: message ioctl on  (MAJOR:MINOR) failed: File exists
> >    Failed to process thin pool message "create_snap 11 4".
> >    Failed to suspend thin snapshot origin <VG>/<THIN LV>.
> >    Internal error: Writing metadata in critical section.
> >    Releasing activation in critical section.
> >    libdevmapper exiting with 1 device(s) still suspended.
>
> So I now have a quite simple reproducer for this unhandled error case.
> It basically exposes a mismatch between the kernel (_tmeta) and lvm2
> metadata content.  And lvm2 can handle this discovery better
> than what you see now.
>
> > There were further error messages as more snapshots were attempted,
> > but I was unable to capture them as my system went down. Upon reboot,
> > the "transaction_id" message that I referred to in my previous message
> > was repeated (but with increased transaction IDs).
>
> For a better fix, it would need to be better understood what happened
> in parallel while the 'lvm' inside dmeventd was resizing the pool data.

To the best of my knowledge, no other LVM operations were in flight at
the time. The script that I use issues LVM commands strictly
sequentially and there are no other daemons that make use of LVM
(except dmeventd and lvmetad, of course). Is there something else I
should be looking out for? Is there another log that I can examine?
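
If more logging on my side would help answer the "what happened in
parallel" question, I believe I can enable persistent lvm logging
before the next attempt with something like this in lvm.conf (the path
and level here are just what I'd pick, not defaults):

    log {
        # write verbose lvm command output to a file for later inspection
        file = "/var/log/lvm2-debug.log"
        level = 7
    }

and then keep an eye on that file (plus dmesg) the next time the backup
script takes a snapshot.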

> It looks like the 'other' lvm managed to create another snapshot
> (and thus the DeviceID appeared to already exist - while it should not,
> according to the lvm2 metadata) before it hit the problem with the
> mismatched transaction_id.
>
> > I will reply privately with my lvm metadata archive and with my
> > header. My profuse thanks, again, for helping me get my system
> > back up and running.
>
> So the valid fix would be to take a 'thin_dump' of the kernel metadata
> (aka the content of the _tmeta device).
> Then check what you have in the lvm2 metadata; likely you will
> find some device in the kernel for which you don't have a match
> in the lvm2 metadata - these devices would need to be copied
> from your other sequence of lvm2 metadata.

I'll reply privately with the compressed output of thin_dump.
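
For completeness, this is roughly how I've been comparing that dump
against the lvm2 metadata so far - names and paths are placeholders for
my actual VG and pool, so please tell me if this isn't the comparison
you had in mind:

    # device IDs the kernel metadata knows about
    grep -o 'dev_id="[0-9]*"' /tmp/tmeta-dump.xml | sort -u

    # device IDs recorded in the lvm2 metadata
    grep 'device_id' /etc/lvm/backup/<VG>

    # the transaction ID in the kernel superblock ...
    grep '<superblock' /tmp/tmeta-dump.xml

    # ... versus the one lvm2 has for the pool
    grep 'transaction_id' /etc/lvm/backup/<VG>

Am I right that any dev_id in the dump with no matching device_id in
the lvm2 metadata is one of the devices you're talking about?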

> Another, maybe simpler, way could be to just remove those devices
> from the thin_dump XML and thin_restore that metadata, which should
> then match lvm2.
>
> The last issue is then to match the 'transaction_id' with the number
> stored in the kernel metadata.
>
> So I'm not sure which way you want to go, and how important those
> snapshots (which could be dropped) are?

Would it be reasonable to use vgcfgrestore again on the
manually-repaired metadata I used before? I'm not entirely sure what
to look for while editing the XML from thin_dump, and I would very
much like to avoid causing further damage to my system. (Also, FWIW,
thin_dump appears to segfault when run with musl libc instead of
glibc. I had to use a glibc recovery image to get this dump. I'm
following up with the Void distro maintainers about this.)
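
Just so I don't make things worse: is the thin_restore route you're
describing roughly the following? (Paths are placeholders, and I'd only
do this against the swapped-out, inactive metadata LV, never the live
pool.)

    # 1. delete the <device dev_id="..."> ... </device> blocks that have
    #    no matching device_id in the lvm2 metadata, and/or fix the
    #    superblock's transaction="..." attribute
    $EDITOR /tmp/tmeta-dump.xml

    # 2. write the edited metadata back to the metadata LV
    thin_restore -i /tmp/tmeta-dump.xml -o /dev/<VG>/<METADATA LV>

    # 3. sanity-check the result before using the pool again
    thin_check /dev/<VG>/<METADATA LV>

If running vgcfgrestore with my previously-repaired lvm2 metadata is
the safer of the two options, I'd much rather do that.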

I plan to "just" create a new thin pool and copy all the thin LVs over
to the new pool. I'm hoping that this will let me wash my hands of
this matter and restore my system to a stable state. The snapshots
that I create are purely read-only so that the slow-moving backup
process can get an atomic view of the filesystem in use. They contain
no important information that isn't also captured in the origin LV
itself.
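
Concretely, the plan looks something like this - sizes, names, and the
filesystem are placeholders, and I'd do it one LV at a time with the
source mounted read-only:

    # create a fresh thin pool alongside the troublesome one
    lvcreate --type thin-pool -L <POOL SIZE> -n newpool <VG>

    # for each existing thin LV, create a matching thin volume ...
    lvcreate -V <LV SIZE> -T <VG>/newpool -n <LV NAME>-new

    # ... and copy the data across at the filesystem level
    mkfs.ext4 /dev/<VG>/<LV NAME>-new
    mount -o ro /dev/<VG>/<LV NAME> /mnt/old
    mount /dev/<VG>/<LV NAME>-new /mnt/new
    rsync -aHAX /mnt/old/ /mnt/new/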

Hello again!

Could someone advise whether the above strategy (restoring my metadata with vgcfgrestore, making a new thin pool, and copying the contents of my thin LVs over at the filesystem level) is likely to cause data loss? Is it likely to solve my ongoing problem with this frustrating thin pool?

Thank you for your advice!
--Duncan Townsend

P.S. This was written on mobile. Please forgive my typos.