linux-lvm.redhat.com archive mirror
* [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
@ 2018-06-03 20:13 Inbox
  2018-06-03 20:18 ` Inbox
  2018-06-03 22:21 ` Inbox
  0 siblings, 2 replies; 12+ messages in thread
From: Inbox @ 2018-06-03 20:13 UTC (permalink / raw)
  To: linux-lvm


I'm aware I can't let my thin volume get full.  I'm actually about to
delete a lot of things.



I don't understand why lvcreate gave the sdh3 "no space left on
device" error, and then lvremove gave it for sdh3, sdg3, and sdf3.
sdf3 has 366G left in its thin pool, and I asked to create a virtual
200G within it.

I don't understand why it failed to write VG, or an MDA of VG.

I'm most concerned about whether anything is corrupted now, or whether
I can ignore this, aside from the fact that I couldn't create a volume.

disk1thin is on sdh3
disk2thin is on sdg3
disk3thin is on sdf3
disk4thin is on sde3

# lvs
...
  disk1thin                       lvm    twi-aot---  <4.53t  84.13  76.33
  disk2thin                       lvm    twi-aot---  <4.53t  85.98  78.09
  disk3thin                       lvm    twi-aot---  <4.53t  92.10  83.47
  disk4thin                       lvm    twi-aot---   4.53t  80.99  36.91
...
# lvcreate -V200G lvm/disk3thin -n test3
  WARNING: Sum of all thin volume sizes (21.22 TiB) exceeds the size of
thin pools and the size of whole volume group (<18.17 TiB).
  WARNING: You have not turned on protection against thin pools running out
of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to
trigger automatic extension of thin pools before they get full.
  /dev/sdh3: write failed after 24064 of 24576 at 4993488961536: No space
left on device
  Failed to write VG lvm.
  Failed to write VG lvm.
  Manual intervention may be required to remove abandoned LV(s) before
retrying.
# lvremove lvm/test3
  /dev/sdh3: write failed after 24064 of 24576 at 4993488961536: No space
left on device
  WARNING: Failed to write an MDA of VG lvm.
  /dev/sdg3: write failed after 24064 of 24576 at 4993488961536: No space
left on device
  WARNING: Failed to write an MDA of VG lvm.
  /dev/sdf3: write failed after 24064 of 24576 at 4993488961536: No space
left on device
  WARNING: Failed to write an MDA of VG lvm.
  Logical volume "test3" successfully removed
# lvs --- shows test3 is gone

# pvs
  PV             VG     Fmt  Attr PSize    PFree
  /dev/sde3      lvm    lvm2 a--     4.54t <10.70g
  /dev/sdf3      lvm    lvm2 a--     4.54t      0
  /dev/sdg3      lvm    lvm2 a--     4.54t      0
  /dev/sdh3      lvm    lvm2 a--     4.54t      0
# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  lvm      4  51   0 wz--n- <18.17t <10.70g
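
(For reference, the protection those WARNINGs refer to lives in
lvm.conf; a minimal sketch with illustrative values, not my actual
config:)

activation {
    # autoextend a thin pool once it crosses 80% full...
    thin_pool_autoextend_threshold = 80
    # ...growing it by 20% each time, provided the VG has free space
    thin_pool_autoextend_percent = 20
}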



* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-03 20:13 [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA" Inbox
@ 2018-06-03 20:18 ` Inbox
  2018-06-03 22:21 ` Inbox
  1 sibling, 0 replies; 12+ messages in thread
From: Inbox @ 2018-06-03 20:18 UTC (permalink / raw)
  To: linux-lvm


Kernel 4.16.8, lvm 2.02.177.

On Sun, Jun 3, 2018 at 1:13 PM, Inbox <jimhaddad46@gmail.com> wrote:

> ...<snip; full quote of the original post>



* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-03 20:13 [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA" Inbox
  2018-06-03 20:18 ` Inbox
@ 2018-06-03 22:21 ` Inbox
  2018-06-03 23:57   ` Inbox
  1 sibling, 1 reply; 12+ messages in thread
From: Inbox @ 2018-06-03 22:21 UTC (permalink / raw)
  To: linux-lvm


Sorry to be mailing again; I think this info helps, though...



On Sun, Jun 3, 2018 at 1:13 PM, Inbox <jimhaddad46@gmail.com> wrote:
...

> # lvcreate -V200G lvm/disk3thin -n test3
>
...

>   /dev/sdh3: write failed after 24064 of 24576 at 4993488961536: No space
> left on device
>
...

> # lvremove lvm/test3
>   /dev/sdh3: write failed after 24064 of 24576 at 4993488961536: No space
> left on device
>   WARNING: Failed to write an MDA of VG lvm.
>   /dev/sdg3: write failed after 24064 of 24576 at 4993488961536: No space
> left on device
>   WARNING: Failed to write an MDA of VG lvm.
>   /dev/sdf3: write failed after 24064 of 24576 at 4993488961536: No space
> left on device
>   WARNING: Failed to write an MDA of VG lvm.
>

fdisk -l shows sdf3, sdg3, and sdh3 are 4993488985600 bytes.

After some reading, I'm guessing mda means metadata area.

If LVM is trying to write the second mda at 4993488961536, there are
only 24064 bytes left in the partition, which is exactly where it says
the write is failing.

I did use "--pvmetadatacopies 2" when running pvcreate.

So, am I right that it's trying to write more than 24k of metadata at the
end of the disk, but there's only 24k left at the end of the disk for a
metadata copy?
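
(The arithmetic does line up; a quick python check using the fdisk
size above:)

disk_size    = 4993488985600   # bytes, per fdisk -l
write_offset = 4993488961536   # where the failing write starts
print(disk_size - write_offset)  # 24064: exactly the bytes written before ENOSPC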

If so, were the other locations for metadata (both the main one and the
first copy) only 24k?  Did I lose metadata in those areas?  Did it
overwrite what was after the metadata area?

Where do I go from here?

# pvdisplay --maps /dev/sdh3
  --- Physical volume ---
  PV Name               /dev/sdh3
  VG Name               lvm
  PV Size               4.54 TiB / not usable 2.19 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              1190540
  Free PE               0
  Allocated PE          1190540
  PV UUID               BK8suJ-dqiy-mdK4-IyUH-aeRR-TPUo-3Dj6JG

  --- Physical Segments ---
  Physical extent 0 to 127:
    Logical volume      /dev/lvm/disk1thin_tmeta
    Logical extents     32 to 159
  Physical extent 128 to 4095:
    Logical volume      /dev/lvm/swap1
    Logical extents     0 to 3967
  Physical extent 4096 to 4127:
    Logical volume      /dev/lvm/lvol0_pmspare
    Logical extents     0 to 31
  Physical extent 4128 to 132127:
    Logical volume      /dev/lvm/disk1thin_tdata
    Logical extents     0 to 127999
  Physical extent 132128 to 132159:
    Logical volume      /dev/lvm/disk1thin_tmeta
    Logical extents     0 to 31
  Physical extent 132160 to 1190539:
    Logical volume      /dev/lvm/disk1thin_tdata
    Logical extents     128000 to 1186379

Those physical extents translate to the starting bytes of each
segment's first and last extents (4MiB PE size):

0 532676608
536870912 17175674880
17179869184 17309892608
17314086912 554180804608
554184998912 554315022336
554319216640 4993482489856
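
(A quick python sketch that reproduces that list; both columns are
extent starting bytes:)

PE = 4 * 1024 * 1024  # 4 MiB physical extent size
segments = [(0, 127), (128, 4095), (4096, 4127),
            (4128, 132127), (132128, 132159), (132160, 1190539)]
for first, last in segments:
    print(str(first * PE) + " " + str(last * PE))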


The failing write offset (4993488961536) minus the last physical
extent's ending byte (4993482489856 + 4*1024*1024, assuming the worst
case here; the ...856 figure is the starting byte of the last extent)
still leaves 2277376 bytes, way more than 24k.



* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-03 22:21 ` Inbox
@ 2018-06-03 23:57   ` Inbox
  2018-06-04  0:19     ` Inbox
  2018-06-04  6:54     ` Patrick Mitchell
  0 siblings, 2 replies; 12+ messages in thread
From: Inbox @ 2018-06-03 23:57 UTC (permalink / raw)
  To: linux-lvm

OK, I think I've traced this back to a bug in pvcreate with
"--pvmetadatacopies 2".  Or, at least a bug in lvm2 2.02.162 - the
version available back then when I started using this disk.

I found pvdissect, a python script from syslinux available here:
https://www.syslinux.org/wiki/index.php?title=Development/LVM_support/pvdissect
***

Working through its output (copied way below), it looks to me like
lvm works within the pvmetadata space it has, overwriting out-of-date
metadata with new copies.

Ordinarily, I don't think this would be fatal.  If lvm works within
the space it has, this just means not as many old copies of metadata
will be kept.  But, the pvcreate bug left room for only 48,640 bytes
of xml in mda1 vs 966,656 in mda0.  As my "lvm = {" xml is 20,624
bytes, there's only room for 2 copies of the xml in mda1.

It must be this combination of too small an xml area in mda1 and a
large "lvm = {" xml that keeps LVM from working within such a
confined space, and makes it try to write past the end of the disk.


The output way below shows:

* disk size is correct (pv_header.device_size 0x48aa3231e00 is
4993488985600, the byte count reported by fdisk)
* mda0 is located at 4096 bytes (pv_header.disk_areas.mda0.offset
0x1000 is 4096 bytes)
* mda0 is size 1044480 bytes (pv_header.disk_areas.mda0.size 0xff000)
* mda1 is located at 4993488781312 bytes, which is 204288 from the
last disk byte (pv_header.disk_areas.mda1.offset 0x48aa3200000)
* mda1 is size 204288 bytes (pv_header.disk_areas.mda1.size 0x31e00)
* the mda checksums are now different (0xf0662726 vs 0xb46ba552)

So, it made mda1 only ~19.5% the size of mda0.

mda0 has room for xml of 966656 bytes.  (starts at 0x14000; mda0 goes
from 0x1000 for 0xff000 bytes, so to 0x100000; 0xec000 available =
966656)

mda1 only has room for xml of 48640 bytes.  (starts at 0x48aa3226000;
mda1 goes from 0x48aa3200000 for 0x31e00 bytes, so to 0x48aa3231e00;
0xbe00 available = 48640)
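
(A quick sanity check of that arithmetic:)

mda0_start, mda0_size, text0 = 0x1000, 0xff000, 0x14000
mda1_start, mda1_size, text1 = 0x48aa3200000, 0x31e00, 0x48aa3226000
print(mda0_start + mda0_size - text0)  # 966656 bytes of xml room in mda0
print(mda1_start + mda1_size - text1)  # 48640 bytes of xml room in mda1
print(48640 // 20624)                  # 2: copies of my 20,624-byte "lvm = {" text that fit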



*** pvdissect note: at the end of the script, per the file's example
usage, I had to add:
my_pv = PV()
my_pv.open("/dev/sdh3")
print my_pv
my_pv.close()



# python2 pvdissect.withPrint
0x00000200 (label_header.id):
    LABELONE
0x00000208 (label_header.sector):
    1
0x00000210 (label_header.crc):
    0x199978e7
0x00000214 (label_header.offset):
    0x20
0x00000218 (label_header.type):
    LVM2 001
0x00000220 (pv_header.uuid):
    BK8suJ-dqiy-mdK4-IyUH-aeRR-TPUo-3Dj6JG
0x00000240 (pv_header.device_size):
    0x48aa3231e00
0x00000248 (pv_header.disk_areas.da0.offset):
    0x100000
0x00000250 (pv_header.disk_areas.da0.size):
    0x0
0x00000268 (pv_header.disk_areas.mda0.offset):
    0x1000
0x00000270 (pv_header.disk_areas.mda0.size):
    0xff000
0x00000278 (pv_header.disk_areas.mda1.offset):
    0x48aa3200000
0x00000280 (pv_header.disk_areas.mda1.size):
    0x31e00
0x00001000 (mda_header.checksum):
    0xf0662726
0x00001004 (mda_header.magic):
     LVM2 x[5A%r0N*>
0x00001014 (mda_header.version):
    1
0x00001018 (mda_header.start):
    0x1000
0x00001020 (mda_header.size):
    0xff000
0x00001028 (mda_header.raw_locns0.offset):
    0x13000
0x00001030 (mda_header.raw_locns0.size):
    0x5072
0x00001038 (mda_header.raw_locns0.checksum):
    0xb13c7340
0x0000103c (mda_header.raw_locns0.flags):
    0
0x00001028 (mda_header.raw_locns0.offset):
    0x13000
0x00001030 (mda_header.raw_locns0.size):
    0x5072
0x00001038 (mda_header.raw_locns0.checksum):
    0xb13c7340
0x0000103c (mda_header.raw_locns0.flags):
    0
0x48aa3200000 (mda_header.checksum):
    0xb46ba552
0x48aa3200004 (mda_header.magic):
     LVM2 x[5A%r0N*>
0x48aa3200014 (mda_header.version):
    1
0x48aa3200018 (mda_header.start):
    0x48aa3200000
0x48aa3200020 (mda_header.size):
    0x31e00
0x48aa3200028 (mda_header.raw_locns0.offset):
    0x26000
0x48aa3200030 (mda_header.raw_locns0.size):
    0x51c8
0x48aa3200038 (mda_header.raw_locns0.checksum):
    0xe5da72ee
0x48aa320003c (mda_header.raw_locns0.flags):
    0
0x48aa3200028 (mda_header.raw_locns0.offset):
    0x26000
0x48aa3200030 (mda_header.raw_locns0.size):
    0x51c8
0x48aa3200038 (mda_header.raw_locns0.checksum):
    0xe5da72ee
0x48aa320003c (mda_header.raw_locns0.flags):
    0
0x00014000 (metadata.value):
...
0x48aa3226000 (metadata.value):
...


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-03 23:57   ` Inbox
@ 2018-06-04  0:19     ` Inbox
  2018-06-04  0:46       ` Jim Haddad
  2018-06-04  6:54     ` Patrick Mitchell
  1 sibling, 1 reply; 12+ messages in thread
From: Inbox @ 2018-06-04  0:19 UTC (permalink / raw)
  To: linux-lvm

On Sun, Jun 3, 2018 at 4:57 PM, Inbox <jimhaddad46@gmail.com> wrote:
> OK, I think I've traced this back to a bug in pvcreate with
> "--pvmetadatacopies 2".  Or, at least a bug in lvm2 2.02.162 - the
> version available back then when I started using this disk.
> ...<snip>

The message sent out first, quoted above, is a reply with more
information.  Below is what was supposed to be the original post.  It
was originally sent as non-plain-text, so it looks like it was
silently ignored and not sent out.


Kernel 4.16.8, lvm 2.02.177.

I'm aware I can't let my thin volume get full.  I'm actually about to
delete a lot of things.



I don't understand why lvcreate gave the sdh3 "no space left on
device" error, and then lvremove gave it for sdh3, sdg3, and sdf3.
sdf3 has 366G left in its thin pool, and I asked to create a virtual
200G within it.  (EDIT: Now, I see it's the pvmetadata copy at the end
of the disk, nothing to do with the thin pools.)

...<snip; the rest duplicates the original post and the 22:21
follow-up above>


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-04  0:19     ` Inbox
@ 2018-06-04  0:46       ` Jim Haddad
  2018-06-04  2:47         ` Jim Haddad
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Haddad @ 2018-06-04  0:46 UTC (permalink / raw)
  To: linux-lvm

On Sun, Jun 3, 2018 at 5:19 PM, Inbox <jimhaddad46@gmail.com> wrote:
> Kernel 4.16.8, lvm 2.02.177.

Again, I set up this disk in 2016 using lvm 2.02.162.

I tried reproducing the "--pvmetadatacopies 2" bug in a VM and could
not, so I presume the calculation of the offset and size for mda1 was
fixed somewhere between 2.02.162 and 2.02.177.  That, or the bug might
still happen on a 4.53t (5TB) disk, but not on my 40G VM.

In the VM, the output of pvdissect is below.  mda0 size is 0xff000,
and mda1.size is 0x100000.  So, mda1 is actually slightly bigger by
4096 bytes, but as long as the circular buffers are handled
independently and these don't need to be the same, that's OK.

The previous problem was that the XML area was being shrunk from
966,656 bytes to 48,640 (so mda1 could hold a whopping 5% as much xml
as mda0.)
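
(Checking both figures:)

print(0x100000 - 0xff000)        # 4096: mda1 is one 4 KiB block bigger in the VM
print(48640 * 100 / 966656.0)    # ~5.03: the old disk's mda1 xml room as a % of mda0's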

But, kernel 4.16.8, lvm 2.02.177 is still trying to write past the
disk given a really small mda1.

I'd be fine if I could run "vgconvert --pvmetadatacopies 1" and forget
the one at the end of the disk.  I don't know if that could be
dangerous to run though, in this situation especially.

(I'm thinking the alternative would be getting another drive, running
the new pvcreate on it, and copying everything over, since I can't
shrink the existing thin pool to make room for a bigger mda1, and
there's no way to really expand it anyway.  I'd rather lose the extra
copy.)



# python2 pvdissect
0x00000200 (label_header.id):
    LABELONE
0x00000208 (label_header.sector):
    1
0x00000210 (label_header.crc):
    0xfa4129bd
0x00000214 (label_header.offset):
    0x20
0x00000218 (label_header.type):
    LVM2 001
0x00000220 (pv_header.uuid):
    SHLDkE-wyYu-2xFJ-jhA9-armj-WOFS-X3d7Sh
0x00000240 (pv_header.device_size):
    0x9fff00000
0x00000248 (pv_header.disk_areas.da0.offset):
    0x100000
0x00000250 (pv_header.disk_areas.da0.size):
    0x0
0x00000268 (pv_header.disk_areas.mda0.offset):
    0x1000
0x00000270 (pv_header.disk_areas.mda0.size):
    0xff000
0x00000278 (pv_header.disk_areas.mda1.offset):
    0x9ffe00000
0x00000280 (pv_header.disk_areas.mda1.size):
    0x100000
0x00001000 (mda_header.checksum):
    0xb4cd28c6
0x00001004 (mda_header.magic):
     LVM2 x[5A%r0N*>
0x00001014 (mda_header.version):
    1
0x00001018 (mda_header.start):
    0x1000
0x00001020 (mda_header.size):
    0xff000
0x00001028 (mda_header.raw_locns0.offset):
    0x2000
0x00001030 (mda_header.raw_locns0.size):
    0x3e6
0x00001038 (mda_header.raw_locns0.checksum):
    0x1f264f08
0x0000103c (mda_header.raw_locns0.flags):
    0
0x00001028 (mda_header.raw_locns0.offset):
    0x2000
0x00001030 (mda_header.raw_locns0.size):
    0x3e6
0x00001038 (mda_header.raw_locns0.checksum):
    0x1f264f08
0x0000103c (mda_header.raw_locns0.flags):
    0
0x9ffe00000 (mda_header.checksum):
    0x566e3e24
0x9ffe00004 (mda_header.magic):
     LVM2 x[5A%r0N*>
0x9ffe00014 (mda_header.version):
    1
0x9ffe00018 (mda_header.start):
    0x9ffe00000
0x9ffe00020 (mda_header.size):
    0x100000
0x9ffe00028 (mda_header.raw_locns0.offset):
    0x2000
0x9ffe00030 (mda_header.raw_locns0.size):
    0x3e6
0x9ffe00038 (mda_header.raw_locns0.checksum):
    0x1f264f08
0x9ffe0003c (mda_header.raw_locns0.flags):
    0
0x9ffe00028 (mda_header.raw_locns0.offset):
    0x2000
0x9ffe00030 (mda_header.raw_locns0.size):
    0x3e6
0x9ffe00038 (mda_header.raw_locns0.checksum):
    0x1f264f08
0x9ffe0003c (mda_header.raw_locns0.flags):
    0
0x00003000 (metadata.value):
...
0x9ffe02000 (metadata.value):
...


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-04  0:46       ` Jim Haddad
@ 2018-06-04  2:47         ` Jim Haddad
  2018-06-04 15:34           ` David Teigland
  0 siblings, 1 reply; 12+ messages in thread
From: Jim Haddad @ 2018-06-04  2:47 UTC (permalink / raw)
  To: linux-lvm, teigland, agk, ejt

Writing past the end of the disk seems to be fixed in git master.

Looks like the culprit is lib/format_text/format-text.c::_vg_write_raw().

I think the write-past-the-end-of-the-disk bug was added in the days
before 2.02.177 was released, by some of Alasdair G Kergon's commits
to that function regarding wrapping and rounding.  I am not 100%
positive of this.

I think this commit reverting those changes FIXES the bug:

commit 00f1b208a1bf44665ec97a791355b1fcf525a3a7
Author: Joe Thornber <ejt@redhat.com>
Date:   Fri Apr 20 10:43:50 2018 -0500

    [io paths] Unpick agk's aio stuff

It also might be fixed/helped by David Teigland's commits regarding bcache.



Hoping I understood the situation well enough that it wouldn't cause
harm, using 2.02.177, I ran:

# lvcreate -ddddddvvvv -V200G lvm/disk3thin -n test3
...
#device/dev-io.c:654           Closed /dev/sdh3
#device/dev-io.c:599           Opened /dev/sdh3 RW O_DIRECT
#device/dev-io.c:168           /dev/sdh3: Block size is 512 bytes
#device/dev-io.c:179           /dev/sdh3: Physical block size is 4096 bytes
#device/dev-io.c:96            Read  /dev/sdh3:     512 bytes (sync)
at 4096 (for VG metadata header)
#device/dev-io.c:255           Widening request for 130 bytes at 81920
to 512 bytes at 81920 on /dev/sdh3 (for VG metadata content)
#device/dev-io.c:96            Read  /dev/sdh3:     512 bytes (sync)
at 81920 (for VG metadata content)
#format_text/format-text.c:799           Writing lvm metadata to
/dev/sdh3 at 106496 len 20934 (rounded to 24576) of 20934 aligned to
4096
#device/dev-io.c:96            Write /dev/sdh3:   24576 bytes (sync)
at 106496 (for VG metadata content)
#device/dev-io.c:96            Read  /dev/sdh3:     512 bytes (sync)
at 4993488781312 (for extra VG metadata header)
#device/dev-io.c:255           Widening request for 130 bytes at
4993488936960 to 512 bytes at 4993488936960 on /dev/sdh3 (for extra VG
metadata content)
#device/dev-io.c:96            Read  /dev/sdh3:     512 bytes (sync)
at 4993488936960 (for extra VG metadata content)
#format_text/format-text.c:799           Writing lvm metadata to
/dev/sdh3 at 4993488961536 len 20934 (rounded to 24576) of 20934
aligned to 4096
#device/dev-io.c:96            Write /dev/sdh3:   24576 bytes (sync)
at 4993488961536 (for extra VG metadata content)
#device/dev-io.c:129     /dev/sdh3: write failed after 24064 of 24576
at 4993488961536: No space left on device
#device/dev-io.c:288           <backtrace>
#format_text/format-text.c:806           <backtrace>
#metadata/metadata.c:3055          <backtrace>
#metadata/metadata.c:3064    Failed to write VG lvm.
#device/dev-io.c:96            Read  /dev/sdh3:     512 bytes (sync)
at 4096 (for VG metadata header)
#device/dev-io.c:255           Widening request for 130 bytes at 81920
to 512 bytes at 81920 on /dev/sdh3 (for VG metadata content)
#device/dev-io.c:96            Read  /dev/sdh3:     512 bytes (sync)
at 81920 (for VG metadata content)
#format_text/format-text.c:920           Wiping pre-committed lvm
metadata from /dev/sdh3 header at 4096
#device/dev-io.c:96            Write /dev/sdh3:     512 bytes (sync)
at 4096 (for VG metadata header)
#metadata/lv_manip.c:7802          <backtrace>
#metadata/lv_manip.c:8078          <backtrace>
#lvcreate.c:1652          <backtrace>
#toollib.c:1987          <backtrace>
...
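
A back-of-the-envelope check of the overshoot in that trace (the
device size is the fdisk figure from earlier):

offset, length, disk = 4993488961536, 20934, 4993488985600
rounded = -(-length // 4096) * 4096  # round the length up to a 4096-byte multiple
print(rounded)                  # 24576
print(offset + rounded - disk)  # 512: bytes past the end of the device
print(disk - offset)            # 24064: matches "write failed after 24064 of 24576"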



With git master, I ran the same command.  It no longer says exactly
how much and where it's writing, just the header address.  But, it
doesn't give an error, so I'm hoping it's properly handling the
situation again:
...
#format_text/format-text.c:331           Reading mda header sector
from /dev/sdh3 at 4096
#format_text/format-text.c:790           Committing lvm metadata (550)
to /dev/sdh3 header at 4096
#format_text/format-text.c:331           Reading mda header sector
from /dev/sdh3 at 4993488781312
#format_text/format-text.c:790           Committing lvm metadata (550)
to /dev/sdh3 header at 4993488781312
#format_text/format-text.c:331           Reading mda header sector
from /dev/sdg3 at 4096
#format_text/format-text.c:790           Committing lvm metadata (550)
to /dev/sdg3 header at 4096
#format_text/format-text.c:331           Reading mda header sector
from /dev/sdg3 at 4993488781312
#format_text/format-text.c:790           Committing lvm metadata (550)
to /dev/sdg3 header at 4993488781312
#format_text/format-text.c:331           Reading mda header sector
from /dev/sdf3 at 4096
#format_text/format-text.c:790           Committing lvm metadata (550)
to /dev/sdf3 header at 4096
#format_text/format-text.c:331           Reading mda header sector
from /dev/sdf3 at 4993488781312
#format_text/format-text.c:790           Committing lvm metadata (550)
to /dev/sdf3 header at 4993488781312
#format_text/format-text.c:331           Reading mda header sector
from /dev/sde3 at 4096
#format_text/format-text.c:790           Committing lvm metadata (550)
to /dev/sde3 header at 4096
...



So, I'm thinking I can upgrade to git master, or at least 178-rc1, and
leave mda1 incredibly small, knowing that if my VG XML grows to a bit
more than double its current size, past the 48,640 bytes available for
it in mda1, I'd need to run "vgconvert --pvmetadatacopies 1" or move
the data off and back on.


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-03 23:57   ` Inbox
  2018-06-04  0:19     ` Inbox
@ 2018-06-04  6:54     ` Patrick Mitchell
  1 sibling, 0 replies; 12+ messages in thread
From: Patrick Mitchell @ 2018-06-04  6:54 UTC (permalink / raw)
  To: LVM general discussion and development, teigland, agk, ejt

On Sun, Jun 3, 2018 at 7:57 PM, Inbox <jimhaddad46@gmail.com> wrote:
...
> Ordinarily, I don't think this would be fatal.  If lvm works within
> the space it has, this just means not as many old copies of metadata
> will be kept.  But, the pvcreate bug left room for only 48,640 bytes
> of xml in mda1 vs 966,656 in mda0.  As my "lvm = {" xml is 20,624
> bytes, there's only room for 2 copies of the xml in mda1.
>
> It must be this combination of too small an xml area in mda1 and a
> large "lvm = {" xml that keeps LVM from working within such a
> confined space, and makes it try to write past the end of the disk.
>
>
> The output way below shows:
>
> * disk size is correct (pv_header.device_size 0x48aa3231e00 is
> 4993488985600, the byte count reported by fdisk)
> * mda0 is located at 4096 bytes (pv_header.disk_areas.mda0.offset
> 0x1000 is 4096 bytes)
> * mda0 is size 1044480 bytes (pv_header.disk_areas.mda0.size 0xff000)
> * mda1 is located at 4993488781312 bytes, which is 204288 from the
> last disk byte (pv_header.disk_areas.mda1.offset 0x48aa3200000)
> * mda1 is size 204288 bytes (pv_header.disk_areas.mda1.size 0x31e00)
> * the mda checksums are now different (0xf0662726 vs 0xb46ba552)
>
> So, it made mda1 only ~19.5% the size of mda0.
>
> mda0 has room for xml of 966656 bytes.  (starts at 0x14000; mda0 goes
> from 0x1000 for 0xff000 bytes, so to 0x100000; 0xec000 available =
> 966656)
>
> mda1 only has room for xml of 48640 bytes.  (starts at 0x48aa3226000;
> mda1 goes from 0x48aa3200000 for 0x31e00 bytes, so to 0x48aa3231e00;
> 0xbe00 available = 48640)
...

Correction here.

I thought the python script's addresses for metadata.value were the
starting position of the XML area.  I was wrong about that.  I see now
those point to mda_header.start + mda_header.raw_locns0.offset.
locns[0] I'm guessing must be the most recent copy.
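
(Checking that against the pvdissect output in the earlier message:)

print(format(0x1000 + 0x13000, '#x'))         # 0x14000: mda0's metadata.value
print(format(0x48aa3200000 + 0x26000, '#x'))  # 0x48aa3226000: mda1's metadata.value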

So, the XML area isn't shrunk down as badly as I was thinking.  mda1
is still smaller:

# pvck -t /dev/sdh3
  TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
  Found label on /dev/sdh3, sector 1, type=LVM2 001
  Found text metadata area: offset=4096, size=1044480
  Found text metadata area: offset=4993488781312, size=204288

But, there is more room in mda1 than just 2 XML copies.  It must have
just been the exact math on the mda1 size, the XML size, the rounding,
and the 2.02.177 algorithm that made it try to write off the disk.
LVM isn't stuck trying to fit 2 XMLs in the area.


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-04  2:47         ` Jim Haddad
@ 2018-06-04 15:34           ` David Teigland
  2018-06-04 16:35             ` David Teigland
  2018-06-04 18:26             ` Jim Haddad
  0 siblings, 2 replies; 12+ messages in thread
From: David Teigland @ 2018-06-04 15:34 UTC (permalink / raw)
  To: Jim Haddad; +Cc: linux-lvm

On Sun, Jun 03, 2018 at 07:47:35PM -0700, Jim Haddad wrote:
> Writing past the end of the disk seems to be fixed in git master.

> Hoping I understood the situation well enough that it wouldn't cause
> harm, using 2.02.177, I ran:

You'll notice some ongoing changes with releases and branches.  I'd
suggest using 2.02.176 or 2.02.178 (skip 2.02.177).  If you want to use a
git branch directly, you may want to look at 2018-06-01-stable since the
master branch may be unstable for a while.

As you suggest, the changes in 2.02.177 related to rounding up write sizes
seem to be your problem with writing beyond the end of the device:

> #format_text/format-text.c:799
  Writing lvm metadata to /dev/sdh3 at 106496 len 20934
  (rounded to 24576) of 20934 aligned to 4096

> #device/dev-io.c:96
  Write /dev/sdh3:   24576 bytes (sync) at 106496

> #format_text/format-text.c:799
  Writing lvm metadata to /dev/sdh3 at 4993488961536 len 20934
  (rounded to 24576) of 20934 aligned to 4096

> #device/dev-io.c:96
  Write /dev/sdh3:   24576 bytes (sync) at 4993488961536

> #device/dev-io.c:129
  /dev/sdh3: write failed after 24064 of 24576 at 4993488961536:
  No space left on device


> With git master, I ran the same command.  It no longer says exactly
> how much and where it's writing, just the header address. 

You should see more writing debug information than you included, like
this...

The first writes will occur in the metadata areas, which are the
closest to the end of the disk and where you had the errors above:

format-text.c:331 Reading mda header sector from /dev/sdb at 4096
format-text.c:678 Writing metadata for VG foo to /dev/sdb at 7168 len 1525 (wrap 0)
format-text.c:331 Reading mda header sector from /dev/sdb at 999665172480
format-text.c:678 Writing metadata for VG foo to /dev/sdb at 999665175552 len 1525 (wrap 0)
format-text.c:331 Reading mda header sector from /dev/sdg at 4096
format-text.c:678 Writing metadata for VG foo to /dev/sdg at 7168 len 1525 (wrap 0)
format-text.c:331 Reading mda header sector from /dev/sdg at 999665172480
format-text.c:678 Writing metadata for VG foo to /dev/sdg at 999665175552 len 1525 (wrap 0)

Then the subsequent writes (to precommit/commit the changes) will go to
the metadata headers, which are just a 512-byte sector, not as close to
the end:

format-text.c:331 Reading mda header sector from /dev/sdb at 4096
format-text.c:790 Pre-Committing foo metadata (3) to /dev/sdb header at 4096
format-text.c:331 Reading mda header sector from /dev/sdb at 999665172480
format-text.c:790 Pre-Committing foo metadata (3) to /dev/sdb header at 999665172480
format-text.c:331 Reading mda header sector from /dev/sdg at 4096
format-text.c:790 Pre-Committing foo metadata (3) to /dev/sdg header at 4096
format-text.c:331 Reading mda header sector from /dev/sdg at 999665172480
format-text.c:790 Pre-Committing foo metadata (3) to /dev/sdg header at 999665172480

format-text.c:331 Reading mda header sector from /dev/sdb at 4096
format-text.c:790 Committing foo metadata (3) to /dev/sdb header at 4096
format-text.c:331 Reading mda header sector from /dev/sdb at 999665172480
format-text.c:790 Committing foo metadata (3) to /dev/sdb header at 999665172480
format-text.c:331 Reading mda header sector from /dev/sdg at 4096
format-text.c:790 Committing foo metadata (3) to /dev/sdg header at 4096
format-text.c:331 Reading mda header sector from /dev/sdg at 999665172480
format-text.c:790 Committing foo metadata (3) to /dev/sdg header at 999665172480


> But, it doesn't give an error, so I'm hoping it's properly handling the
> situation again:

Most of the metadata writing changes from 2.02.177 should not be in 178.


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-04 15:34           ` David Teigland
@ 2018-06-04 16:35             ` David Teigland
  2018-06-04 18:26             ` Jim Haddad
  1 sibling, 0 replies; 12+ messages in thread
From: David Teigland @ 2018-06-04 16:35 UTC (permalink / raw)
  To: Jim Haddad; +Cc: linux-lvm

On Mon, Jun 04, 2018 at 10:34:34AM -0500, David Teigland wrote:
> format-text.c:331 Reading mda header sector from /dev/sdb at 4096
> format-text.c:678 Writing metadata for VG foo to /dev/sdb at 7168 len 1525 (wrap 0)
> format-text.c:331 Reading mda header sector from /dev/sdb at 999665172480
> format-text.c:678 Writing metadata for VG foo to /dev/sdb at 999665175552 len 1525 (wrap 0)
> format-text.c:331 Reading mda header sector from /dev/sdg at 4096
> format-text.c:678 Writing metadata for VG foo to /dev/sdg at 7168 len 1525 (wrap 0)
> format-text.c:331 Reading mda header sector from /dev/sdg at 999665172480
> format-text.c:678 Writing metadata for VG foo to /dev/sdg at 999665175552 len 1525 (wrap 0)

To illustrate what you should see when the metadata wraps, using the default
metadata area size:

The initial metadata written by vgcreate:

Reading mda header sector from /dev/sdb at 4096
Writing metadata for VG foo to /dev/sdb at 4608 len 931 (wrap 0)
Reading mda header sector from /dev/sdb at 999665172480
Writing metadata for VG foo to /dev/sdb at 999665172992 len 931 (wrap 0)
Reading mda header sector from /dev/sdg at 4096
Writing metadata for VG foo to /dev/sdg at 4608 len 931 (wrap 0)
Reading mda header sector from /dev/sdg at 999665172480
Writing metadata for VG foo to /dev/sdg at 999665172992 len 931 (wrap 0)

When the metadata write wraps, it does not go beyond the end of the device;
it returns to the original offset used above:

Reading mda header sector from /dev/sdb at 4096
Writing metadata for VG foo to /dev/sdb at 1043968 len 4608 (wrap 19895)
Writing metadata for VG foo to /dev/sdb at 4608 len 19895 (wrapped)
Reading mda header sector from /dev/sdb at 999665172480
Writing metadata for VG foo to /dev/sdb at 999666212352 len 8704 (wrap 15799)
Writing metadata for VG foo to /dev/sdb at 999665172992 len 15799 (wrapped)
Reading mda header sector from /dev/sdg at 4096
Writing metadata for VG foo to /dev/sdg at 1043968 len 4608 (wrap 19895)
Writing metadata for VG foo to /dev/sdg at 4608 len 19895 (wrapped)
Reading mda header sector from /dev/sdg at 999665172480
Writing metadata for VG foo to /dev/sdg at 999666212352 len 8704 (wrap 15799)
Writing metadata for VG foo to /dev/sdg at 999665172992 len 15799 (wrapped)

The write that goes to the end of the metadata area starts at 999666212352 and
has length 8704.  start+length is 999666221056, which matches the device size:

# pvs -o name,dev_size --units b /dev/sdb /dev/sdg
  PV         DevSize      
  /dev/sdb   999666221056B
  /dev/sdg   999666221056B
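
To make the wrap mechanics concrete, here is a rough python sketch of
how such a write splits at the end of the circular buffer (my
illustration, not the actual lvm code), using the second mda from the
trace above:

def split_metadata_write(area_start, area_size, offset, length, header=512):
    # usable buffer is [area_start + header, area_start + area_size);
    # the first 512-byte sector of the area holds the mda header
    end = area_start + area_size
    first = min(length, end - offset)  # bytes that fit before the area's end
    chunks = [(offset, first)]
    if length > first:                 # wrap back to just past the header sector
        chunks.append((area_start + header, length - first))
    return chunks

print(split_metadata_write(999665172480, 0x100000, 999666212352, 8704 + 15799))
# [(999666212352, 8704), (999665172992, 15799)], matching the trace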


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-04 15:34           ` David Teigland
  2018-06-04 16:35             ` David Teigland
@ 2018-06-04 18:26             ` Jim Haddad
  2018-06-04 18:52               ` David Teigland
  1 sibling, 1 reply; 12+ messages in thread
From: Jim Haddad @ 2018-06-04 18:26 UTC (permalink / raw)
  To: David Teigland; +Cc: linux-lvm

On Mon, Jun 4, 2018 at 8:34 AM, David Teigland <teigland@redhat.com> wrote:
> On Sun, Jun 03, 2018 at 07:47:35PM -0700, Jim Haddad wrote:
>> Writing past the end of the disk seems to be fixed in git master.
>
>> Hoping I understood the situation well enough that it wouldn't cause
>> harm, using 2.02.177, I ran:
>
> You'll notice some ongoing changes with releases and branches.  I'd
> suggest using 2.02.176 and 2.02.178 (skip 2.02.177).  If you want to use a
> git branch directly, you may want to look at 2018-06-01-stable since the
> master branch may be unstable for a while.

Thanks for your replies.  Will do.  I did have some compilation errors
on master, but the build was already past creating tools/lvm, so I was
able to use that binary.

>> With git master, I ran the same command.  It no longer says exactly
>> how much and where it's writing, just the header address.
>
> You should see more writing debug information than you included, like
> this...

You're absolutely right, I didn't look high enough up.

Do you think I'm right that there are no lasting effects from having
run into this problem?  Meaning, if I run 2.02.176/178/2018-06-01-stable,
I'm all set and don't need to copy all the data off the disk and redo
it?


* Re: [linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
  2018-06-04 18:26             ` Jim Haddad
@ 2018-06-04 18:52               ` David Teigland
  0 siblings, 0 replies; 12+ messages in thread
From: David Teigland @ 2018-06-04 18:52 UTC (permalink / raw)
  To: Jim Haddad; +Cc: linux-lvm

On Mon, Jun 04, 2018 at 11:26:29AM -0700, Jim Haddad wrote:
> Do you think I'm right, that there are no lasting effects from having
> ran into problem?  Meaning, if I run 2.02.176/178/2018-06-01-stable
> I'm all set, and don't need to copy all the data off the disk and redo
> it?

It's probably ok.  With one of the good versions above, run 'vgs -vvvv'
and check that the offsets look good.  Then run a pointless command to
write the metadata with -vvvv and check that the vg writes are happening
correctly.  "vgchange -vvvv --addtag foo <vgname>" will write a new
version of the metadata and won't have any effect on LVs.
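
Concretely, that check might look something like this (the --deltag
cleanup afterwards is my addition):

# vgs -vvvv 2>&1 | grep 'mda header'
# vgchange -vvvv --addtag foo lvm 2>&1 | grep metadata
# vgchange --deltag foo lvm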

