From: Eric Ren
Date: Thu, 11 Apr 2019 18:01:30 +0800
Subject: Re: [linux-lvm] Aborting. LV mythinpool_tmeta is now incomplete
To: LVM general discussion and development, lvm-devel@redhat.com, thornber@redhat.com
Reply-To: LVM general discussion and development
List-Id: LVM general discussion and development

Hi,

Another error message is:

"Failed to suspend thin snapshot origin ..."

which is in _lv_create_an_lv():

```
/* Excerpt: source lines 7829-7846 */
} else if (lv_is_thin_volume(lv)) {
    /* For snapshot, suspend active thin origin first */
    if (origin_lv && lv_is_active(origin_lv) && lv_is_thin_volume(origin_lv)) {
        if (!suspend_lv_origin(cmd, origin_lv)) {
            log_error("Failed to suspend thin snapshot origin %s/%s.",
                      origin_lv->vg->name, origin_lv->name);
            goto revert_new_lv;
        }
        if (!resume_lv_origin(cmd, origin_lv)) { /* deptree updates thin-pool */
            log_error("Failed to resume thin snapshot origin %s/%s.",
                      origin_lv->vg->name, origin_lv->name);
            goto revert_new_lv;
        }
        /* At this point remove pool messages, snapshot is active */
        if (!update_pool_lv(pool_lv, 0)) {
            stack;
            goto revert_new_lv;
        }
```

I don't understand why we need to call suspend_lv_origin() and resume_lv_origin() back to back here.
And what could cause this error?

Regards,
Eric

On Thu, 11 Apr 2019 at 08:27, Eric Ren <renzhengeek@gmail.com> wrote:
Hello list,

Recently, while exercising our container environment, which uses LVM to manage thin LVs, we hit a very strange error when activating a thin LV:

"Aborting. LV mythinpool_tmeta is now incomplete and '--activationmode partial' was not specified.\n: exit status 5: unknown"

CentOS 7.6
# lvm version
  LVM version:     2.02.180(2)-RHEL7 (2018-07-20)
  Library version: 1.02.149-RHEL7 (2018-07-20)
  Driver version:  4.35.0

It has appeared several times, but it cannot be reproduced with simple steps. Only that one activation fails; afterwards everything seems OK.

I looked at the code a bit. At first I suspected that the PV might disappear for some reason. But the VG sits on a single PV, the setup is simple, and the environment is only used for testing, so it seems unlikely that the PV had a problem at that moment, and I don't see any error messages about the disk.

```
/* Excerpt: source lines 2513-2526 */
/* FIXME Avoid repeating identical stat in dm_tree_node_add_target_area */
for (s = start_area; s < areas; s++) {
    if ((seg_type(seg, s) == AREA_PV &&
         (!seg_pvseg(seg, s) || !seg_pv(seg, s) || !seg_dev(seg, s) ||
          !(name = dev_name(seg_dev(seg, s))) || !*name ||
          stat(name, &info) < 0 || !S_ISBLK(info.st_mode))) ||
        (seg_type(seg, s) == AREA_LV && !seg_lv(seg, s))) {
        if (!seg->lv->vg->cmd->partial_activation) {
            if (!seg->lv->vg->cmd->degraded_activation ||
                !lv_is_raid_type(seg->lv)) {
                log_error("Aborting. LV %s is now incomplete "
                          "and '--activationmode partial' was not specified.",
                          display_lvname(seg->lv));
                return 0;
```
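
The device part of that condition can be tried on its own, outside of LVM. Below is a minimal sketch, assuming the PV's device path is known. is_block_device() is a made-up helper name, not an LVM function, and it only mirrors the !*name, stat() and S_ISBLK() tests from the excerpt above, not the seg_*() lookups.

```c
/* Sketch only: mimics the stat()/S_ISBLK() part of the check above.
 * is_block_device() is a hypothetical helper, not part of LVM. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if path names an existing block device, 0 otherwise.
 * The reason for a failure is printed to stderr. */
int is_block_device(const char *path)
{
    struct stat info;

    /* Same idea as the !name / !*name tests: reject a missing or empty name. */
    if (!path || !*path) {
        fprintf(stderr, "empty device name\n");
        return 0;
    }

    /* stat() fails if the device node is gone or unreachable. */
    if (stat(path, &info) < 0) {
        fprintf(stderr, "stat(%s) failed: %s\n", path, strerror(errno));
        return 0;
    }

    /* The node exists but is not a block device (file, directory, ...). */
    if (!S_ISBLK(info.st_mode)) {
        fprintf(stderr, "%s is not a block device\n", path);
        return 0;
    }

    return 1;
}
```

Calling this in a loop on the PV path while the failing workload runs might show whether the device node ever goes missing. That covers only one of the ways the condition can trigger.
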
Has anyone seen the same problem, or does anyone have hints for hunting down the root cause? Any suggestions are welcome!

Regards,
Eric


--
- Eric Ren