From: Duncan Townsend
Date: Wed, 23 Sep 2020 13:13:01 -0500
Subject: Re: [linux-lvm] thin: pool target too small
To: Zdenek Kabelac
Cc: LVM general discussion and development
References: <73d0ffcd-4ed5-38b1-0d17-a4b16c7863d6@redhat.com>

On Tue, Sep 22, 2020, 5:02 PM Zdenek Kabelac wrote:

> On 21. 09. 20 at 15:47 Duncan Townsend wrote:
> > On Mon, Sep 21, 2020 at 5:23 AM Zdenek Kabelac wrote:
> >>
> >> On 21. 09. 20 at 1:48 Duncan Townsend wrote:
> >>> Hello!
> >
> > Ahh, thank you for the reminder. My apologies for not including this
> > in my original message. I use Void Linux on aarch64-musl:
> >
> >>> I had a problem with a runit script that caused my dmeventd to be
> >>> killed and restarted every 5 seconds. The script has been fixed, but
> >>
> >> Killing dmeventd is always a BAD plan.
> >> Either you do not want monitoring (set monitoring = 0 in lvm.conf) - or
> >> leave it to do its job - killing dmeventd in the middle of its work
> >> isn't going to end well...
> >
> > Thank you for reinforcing this. My runit script was fighting with
> > dracut in my initramfs. My runit script saw that there was a dmeventd
> > not under its control, and so tried to kill the one started by dracut.
> > I've gone and disabled the runit script and replaced it with a stub
> > that simply tries to kill the dracut-started dmeventd when it receives
> > a signal.
>
> Ok - so from looking at the resulting 'mixture' of metadata you
> have in your archive and what is physically present in the PV header,
> and now noticing this is some 'not-so-standard' distro - I'm starting to
> suspect the reason for all these troubles.

I'm running Void Linux, and I haven't changed the initramfs from the
defaults at all. I'll report this to the distro maintainers. Thanks!

> Within 'dracut' you shouldn't be firing up dmeventd at all - monitoring
> should be enabled (by vgchange --monitor y) once you switch to your rootfs,
> so that dmeventd is executed from your rootfs!
>
> By letting 'dmeventd' run in your 'dracut world' - it lives in its
> own environment and likely has its very own locking dir.
>
> That makes it very easy for your dmeventd to run in parallel with your
> other lvm2 commands - and since this way it's completely unprotected
> (as file-locking is what it is), and as the resize is an operation lasting
> several seconds, it has happened that your 'admin' command replaced
> whatever dmeventd was doing.

dmeventd does write its PID file into the correct directory in the
post-initramfs root, so whatever's happening is some weird hybrid. I'll
debug this further with my distro.

> So I think to prevent a repeat of this problem you'll need
> to ensure your system boot follows the pattern of distros
> like Fedora.

I think for now, the easiest solution may be to try to stop dmeventd
from being started by dracut.

I have encountered a further problem in the process of restoring my thin
pool to a working state. After using vgcfgrestore to fix the mismatching
metadata using the file Zdenek kindly provided privately, when I try to
activate my thin LVs, I'm now getting the error message:

    Thin pool <THIN POOL LONG NAME>-tpool transaction_id (MAJOR:MINOR) transaction_id is XXX, while expected YYY.

where XXX == YYY + 2. Is this indicative of a deeper problem? I'm
hesitant to run any further operations on this volume without advice
because I think I'm in fairly uncharted territory now.

Thank you again for all your help!
--Duncan Townsend

P.S. This was typed on mobile. Please forgive any typos.
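
For anyone who wants to check the size of the transaction_id mismatch mechanically rather than by eye, here is a minimal sketch. It assumes the message has the shape quoted in this mail ("transaction_id is XXX, while expected YYY"); the exact wording printed by a given lvm2 version is an assumption, and the function name, pool name and numbers below are hypothetical, chosen only to reproduce a gap of 2.

```python
import re

# Matches the message shape quoted in this mail. The exact text emitted by
# a particular lvm2 version is an assumption, not something verified here.
_MISMATCH = re.compile(r"transaction_id is (\d+), while expected (\d+)")


def transaction_id_gap(message: str):
    """Return (found, expected, found - expected), or None if no match."""
    m = _MISMATCH.search(message)
    if m is None:
        return None
    found, expected = int(m.group(1)), int(m.group(2))
    return found, expected, found - expected


if __name__ == "__main__":
    # Hypothetical pool name and ids, picked so that found == expected + 2.
    sample = ("Thin pool vg-pool-tpool (253:3) transaction_id is 12, "
              "while expected 10.")
    print(transaction_id_gap(sample))
```

This only reports the gap between the two numbers in the message. It does not touch the pool and says nothing about which side is correct.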