linux-lvm.redhat.com archive mirror
* [linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output
@ 2018-05-10 19:30 John Hamilton
  2018-05-11  8:21 ` Joe Thornber
  0 siblings, 1 reply; 5+ messages in thread
From: John Hamilton @ 2018-05-10 19:30 UTC (permalink / raw)
  To: linux-lvm


I saw something today that I don't understand and I'm hoping somebody can
help.  We had a ~2.5TB thin pool that was showing 69% data utilization in
lvs:

# lvs -a
  LV              VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  my-pool         myvg twi-aotz--  2.44t             69.04  4.90
  [my-pool_tdata] myvg Twi-ao----  2.44t
  [my-pool_tmeta] myvg ewi-ao---- 15.81g

However, when I dump the thin pool metadata and look at the mapped_blocks
for the 2 devices in the pool, I can only account for about 950GB.  Here is
the superblock and device entries from the metadata xml.  There are no
other devices listed in the metadata:

<superblock uuid="" time="34" transaction="68" flags="0" version="2"
data_block_size="128" nr_data_blocks="0">
  <device dev_id="1" mapped_blocks="258767" transaction="0"
creation_time="0" snap_time="14">
  <device dev_id="8" mapped_blocks="15616093" transaction="27"
creation_time="15" snap_time="34">

That first device looks like it has about 16GB allocated to it and the
second device about 950GB.  So, I would expect lvs to show somewhere
between 950G and 966G. Is something wrong, or am I misunderstanding how to read
the metadata dump?  Where is the other 700 or so GB that lvs is showing
used?
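
For reference, here is the arithmetic I am doing (assuming data_block_size
is in 512-byte sectors, so 128 sectors = 64KiB per pool block):

  dev_id 1:   258767 mapped_blocks * 64KiB ~  15.8G
  dev_id 8: 15616093 mapped_blocks * 64KiB ~ 953G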

Thanks,

John



* Re: [linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output
  2018-05-10 19:30 [linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output John Hamilton
@ 2018-05-11  8:21 ` Joe Thornber
  2018-05-11  8:54   ` Marian Csontos
  0 siblings, 1 reply; 5+ messages in thread
From: Joe Thornber @ 2018-05-11  8:21 UTC (permalink / raw)
  To: john.l.hamilton, LVM general discussion and development

On Thu, May 10, 2018 at 07:30:09PM +0000, John Hamilton wrote:
> I saw something today that I don't understand and I'm hoping somebody can
> help.  We had a ~2.5TB thin pool that was showing 69% data utilization in
> lvs:
> 
> # lvs -a
>   LV              VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
>   my-pool         myvg twi-aotz--  2.44t             69.04  4.90
>   [my-pool_tdata] myvg Twi-ao----  2.44t
>   [my-pool_tmeta] myvg ewi-ao---- 15.81g
> 
> However, when I dump the thin pool metadata and look at the mapped_blocks
> for the 2 devices in the pool, I can only account for about 950GB.  Here is
> the superblock and device entries from the metadata xml.  There are no
> other devices listed in the metadata:
> 
> <superblock uuid="" time="34" transaction="68" flags="0" version="2"
> data_block_size="128" nr_data_blocks="0">
>   <device dev_id="1" mapped_blocks="258767" transaction="0"
> creation_time="0" snap_time="14">
>   <device dev_id="8" mapped_blocks="15616093" transaction="27"
> creation_time="15" snap_time="34">
> 
> That first device looks like it has about 16GB allocated to it and the
> second device about 950GB.  So, I would expect lvs to show somewhere
> between 950G and 966G. Is something wrong, or am I misunderstanding how to read
> the metadata dump?  Where is the other 700 or so GB that lvs is showing
> used?

The non-zero snap_time suggests that you're using snapshots.  In which case it
could just be that there is common data shared between volumes that is getting
counted more than once.

You can confirm this using the thin_ls tool and specifying a format line that
includes EXCLUSIVE_BLOCKS or SHARED_BLOCKS.  LVM doesn't take shared blocks into
account because it would have to scan all the metadata to calculate what's shared.
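
Something along these lines, for example (the metadata device path is just
illustrative; if the pool is live you would normally reserve a metadata
snapshot first with 'dmsetup message <pool> 0 reserve_metadata_snap', add
-m to thin_ls, and release it afterwards with release_metadata_snap):

  thin_ls --format "DEV,MAPPED_BLOCKS,EXCLUSIVE_BLOCKS,SHARED_BLOCKS" \
      /dev/mapper/myvg-my--pool_tmeta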

- Joe


* Re: [linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output
  2018-05-11  8:21 ` Joe Thornber
@ 2018-05-11  8:54   ` Marian Csontos
  2018-05-11 17:09     ` John Hamilton
  0 siblings, 1 reply; 5+ messages in thread
From: Marian Csontos @ 2018-05-11  8:54 UTC (permalink / raw)
  To: john.l.hamilton, LVM general discussion and development

On 05/11/2018 10:21 AM, Joe Thornber wrote:
> On Thu, May 10, 2018 at 07:30:09PM +0000, John Hamilton wrote:
>> I saw something today that I don't understand and I'm hoping somebody can
>> help.  We had a ~2.5TB thin pool that was showing 69% data utilization in
>> lvs:
>>
>> # lvs -a
>>    LV              VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
>>    my-pool         myvg twi-aotz--  2.44t             69.04  4.90
>>    [my-pool_tdata] myvg Twi-ao----  2.44t
>>    [my-pool_tmeta] myvg ewi-ao---- 15.81g

Is this everything? Is this a pool used by docker, which does not (did 
not) use LVM to manage thin-volumes?

>> However, when I dump the thin pool metadata and look at the mapped_blocks
>> for the 2 devices in the pool, I can only account for about 950GB.  Here is
>> the superblock and device entries from the metadata xml.  There are no
>> other devices listed in the metadata:
>>
>> <superblock uuid="" time="34" transaction="68" flags="0" version="2"
>> data_block_size="128" nr_data_blocks="0">
>>    <device dev_id="1" mapped_blocks="258767" transaction="0"
>> creation_time="0" snap_time="14">
>>    <device dev_id="8" mapped_blocks="15616093" transaction="27"
>> creation_time="15" snap_time="34">
>>
>> That first device looks like it has about 16GB allocated to it and the
>> second device about 950GB.  So, I would expect lvs to show somewhere
>> between 950G and 966G. Is something wrong, or am I misunderstanding how to read
>> the metadata dump?  Where is the other 700 or so GB that lvs is showing
>> used?
> 
> The non-zero snap_time suggests that you're using snapshots.  In which case it
> could just be that there is common data shared between volumes that is getting
> counted more than once.
> 
> You can confirm this using the thin_ls tool and specifying a format line that
> includes EXCLUSIVE_BLOCKS or SHARED_BLOCKS.  LVM doesn't take shared blocks into
> account because it would have to scan all the metadata to calculate what's shared.

LVM just queries DM and displays whatever it provides. You can see that
in the `dmsetup status` output: there are two '/'-separated pairs, the
first being metadata usage (USED_BLOCKS/ALL_BLOCKS), the second data
usage (USED_CHUNKS/ALL_CHUNKS).
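
For a thin-pool target the status line is laid out roughly like this
(paraphrased from the kernel's thin-provisioning documentation; it is a
single line, wrapped here for readability):

  <start> <length> thin-pool <transaction id>
      <used metadata blocks>/<total metadata blocks>
      <used data blocks>/<total data blocks>
      <held metadata root> ro|rw ...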

So the error lies somewhere between dmsetup and kernel.

What is kernel/lvm version?
Is thin_check_executable configured in lvm.conf?

-- Martian


* Re: [linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output
  2018-05-11  8:54   ` Marian Csontos
@ 2018-05-11 17:09     ` John Hamilton
  2018-05-16 14:43       ` John Hamilton
  0 siblings, 1 reply; 5+ messages in thread
From: John Hamilton @ 2018-05-11 17:09 UTC (permalink / raw)
  To: Marian Csontos; +Cc: LVM general discussion and development


Thanks for the response.

>Is this everything?

Yes, that is everything in the metadata xml dump.  I just removed all of
the *_mapping entries for brevity.  For the lvs output I removed other
logical volumes that aren't related to this pool.
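
(For context, the stripped entries look roughly like the following; the
element names are what thin_dump emits, the values here are made up:)

  <single_mapping origin_block="0" data_block="1234" time="14"/>
  <range_mapping origin_begin="1" data_begin="5000" length="128" time="14"/>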

>Is this a pool used by docker, which does not (did not) use LVM to manage
thin-volumes?

It's not docker, but it is an application called serviced that uses
docker's library for managing the volumes.

>LVM just queries DM, and displays whatever that provides

Yeah, it looks like dmsetup status output matches lvs:

myvg-my--pool: 0 5242880000 thin-pool 70 207941/4145152 29018611/40960000 - rw discard_passdown queue_if_no_space -
myvg-my--pool_tdata: 0 4194304000 linear
myvg-my--pool_tdata: 4194304000 1048576000 linear
myvg-my--pool_tmeta: 0 33161216 linear
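
Doing the same division on the data pair from that status line:

  29018611 / 40960000 ~ 70.8%  data  (vs 69.04% in yesterday's lvs, so
                                      usage has presumably crept up a bit)
  207941 / 4145152    ~  5.0%  metadata  (vs 4.90%)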

>What is kernel/lvm version?

# uname -r
3.10.0-693.21.1.el7.x86_64

# lvm version
LVM version:     2.02.171(2)-RHEL7 (2017-05-03)
Library version: 1.02.140-RHEL7 (2017-05-03)
Driver version:  4.35.0
Configuration:   ./configure --build=x86_64-redhat-linux-gnu
--host=x86_64-redhat-linux-gnu --program-prefix=
--disable-dependency-tracking --prefix=/usr --exec-prefix=/usr
--bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
--datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64
--libexecdir=/usr/libexec --localstatedir=/var
--sharedstatedir=/var/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-default-dm-run-dir=/run
--with-default-run-dir=/run/lvm --with-default-pid-dir=/run
--with-default-locking-dir=/run/lock/lvm --with-usrlibdir=/usr/lib64
--enable-lvm1_fallback --enable-fsadm --with-pool=internal
--enable-write_install --with-user= --with-group= --with-device-uid=0
--with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig
--enable-applib --enable-cmdlib --enable-dmeventd
--enable-blkid_wiping --enable-python2-bindings
--with-cluster=internal --with-clvmd=corosync --enable-cmirrord
--with-udevdir=/usr/lib/udev/rules.d --enable-udev_sync
--with-thin=internal --enable-lvmetad --with-cache=internal
--enable-lvmpolld --enable-lvmlockd-dlm --enable-lvmlockd-sanlock
--enable-dmfilemapd

>Is thin_check_executable configured in lvm.conf?

Yes

I also just found out that they apparently ran thin_check recently and got
a message about a corrupt superblock, but didn't repair it.  They were
still able to re-activate the pool though. We'll run a repair as soon as we
get a chance and see if that fixes it.

Thanks,

John

On Fri, May 11, 2018 at 3:54 AM Marian Csontos <mcsontos@redhat.com> wrote:

> On 05/11/2018 10:21 AM, Joe Thornber wrote:
> > On Thu, May 10, 2018 at 07:30:09PM +0000, John Hamilton wrote:
> >> I saw something today that I don't understand and I'm hoping somebody
> can
> >> help.  We had a ~2.5TB thin pool that was showing 69% data utilization
> in
> >> lvs:
> >>
> >> # lvs -a
> >>    LV              VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
> >>    my-pool         myvg twi-aotz--  2.44t             69.04  4.90
> >>    [my-pool_tdata] myvg Twi-ao----  2.44t
> >>    [my-pool_tmeta] myvg ewi-ao---- 15.81g
>
> Is this everything? Is this a pool used by docker, which does not (did
> not) use LVM to manage thin-volumes?
>
> >> However, when I dump the thin pool metadata and look at the
> mapped_blocks
> >> for the 2 devices in the pool, I can only account for about 950GB.
> Here is
> >> the superblock and device entries from the metadata xml.  There are no
> >> other devices listed in the metadata:
> >>
> >> <superblock uuid="" time="34" transaction="68" flags="0" version="2"
> >> data_block_size="128" nr_data_blocks="0">
> >>    <device dev_id="1" mapped_blocks="258767" transaction="0"
> >> creation_time="0" snap_time="14">
> >>    <device dev_id="8" mapped_blocks="15616093" transaction="27"
> >> creation_time="15" snap_time="34">
> >>
> >> That first device looks like it has about 16GB allocated to it and the
> >> second device about 950GB.  So, I would expect lvs to show somewhere
> >> between 950G and 966G. Is something wrong, or am I misunderstanding how to
> read
> >> the metadata dump?  Where is the other 700 or so GB that lvs is showing
> >> used?
> >
> > The non-zero snap_time suggests that you're using snapshots.  In which
> case it
> > could just be that there is common data shared between volumes that is
> getting counted
> > more than once.
> >
> > You can confirm this using the thin_ls tool and specifying a format line
> that
> > includes EXCLUSIVE_BLOCKS or SHARED_BLOCKS.  LVM doesn't take shared
> blocks into
> > account because it would have to scan all the metadata to calculate what's
> shared.
>
> LVM just queries DM, and displays whatever that provides. You could see
> that in `dmsetup status` output, there are two pairs of '/' separated
> entries - first is metadata usage (USED_BLOCKS/ALL_BLOCKS), second data
> usage (USED_CHUNKS/ALL_CHUNKS).
>
> So the error lies somewhere between dmsetup and kernel.
>
> What is kernel/lvm version?
> Is thin_check_executable configured in lvm.conf?
>
> -- Martian
>



* Re: [linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output
  2018-05-11 17:09     ` John Hamilton
@ 2018-05-16 14:43       ` John Hamilton
  0 siblings, 0 replies; 5+ messages in thread
From: John Hamilton @ 2018-05-16 14:43 UTC (permalink / raw)
  To: Marian Csontos; +Cc: LVM general discussion and development


So it turns out simply running lvconvert --repair fixed the issue and lvs
is now reporting the correct utilization.
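
For anyone who hits the same thing, a rough sketch of what the repair looks
like (names as in this thread; the pool and its thin volumes need to be
deactivated first, and LVM keeps the old metadata around as a my-pool_meta0
LV that can be removed once everything checks out):

  lvchange -an myvg/my-pool
  lvconvert --repair myvg/my-pool
  lvchange -ay myvg/my-pool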

On Fri, May 11, 2018 at 12:09 PM John Hamilton <john.l.hamilton@gmail.com>
wrote:

> Thanks for the response.
>
> >Is this everything?
>
> Yes, that is everything in the metadata xml dump.  I just removed all of
> the *_mapping entries for brevity.  For the lvs output I removed other
> logical volumes that aren't related to this pool.
>
> >Is this a pool used by docker, which does not (did not) use LVM to
> manage thin-volumes?
>
> It's not docker, but it is an application called serviced that uses
> docker's library for managing the volumes.
>
> >LVM just queries DM, and displays whatever that provides
>
> Yeah, it looks like dmsetup status output matches lvs:
>
> myvg-my--pool: 0 5242880000 thin-pool 70 207941/4145152 29018611/40960000 - rw discard_passdown queue_if_no_space -
> myvg-my--pool_tdata: 0 4194304000 linear
> myvg-my--pool_tdata: 4194304000 1048576000 linear
> myvg-my--pool_tmeta: 0 33161216 linear
>
> >What is kernel/lvm version?
>
> # uname -r
> 3.10.0-693.21.1.el7.x86_64
>
> # lvm version
> LVM version:     2.02.171(2)-RHEL7 (2017-05-03)
> Library version: 1.02.140-RHEL7 (2017-05-03)
> Driver version:  4.35.0
> Configuration:   ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-default-dm-run-dir=/run --with-default-run-dir=/run/lvm --with-default-pid-dir=/run --with-default-locking-dir=/run/lock/lvm --with-usrlibdir=/usr/lib64 --enable-lvm1_fallback --enable-fsadm --with-pool=internal --enable-write_install --with-user= --with-group= --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig --enable-applib --enable-cmdlib --enable-dmeventd --enable-blkid_wiping --enable-python2-bindings --with-cluster=internal --with-clvmd=corosync --enable-cmirrord --with-udevdir=/usr/lib/udev/rules.d --enable-udev_sync --with-thin=internal --enable-lvmetad --with-cache=internal --enable-lvmpolld --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-dmfilemapd
>
> >Is thin_check_executable configured in lvm.conf?
>
> Yes
>
> I also just found out that they apparently ran thin_check recently and got
> a message about a corrupt superblock, but didn't repair it.  They were
> still able to re-activate the pool though. We'll run a repair as soon as we
> get a chance and see if that fixes it.
>
> Thanks,
>
> John
>
> On Fri, May 11, 2018 at 3:54 AM Marian Csontos <mcsontos@redhat.com>
> wrote:
>
>> On 05/11/2018 10:21 AM, Joe Thornber wrote:
>> > On Thu, May 10, 2018 at 07:30:09PM +0000, John Hamilton wrote:
>> >> I saw something today that I don't understand and I'm hoping somebody
>> can
>> >> help.  We had a ~2.5TB thin pool that was showing 69% data utilization
>> in
>> >> lvs:
>> >>
>> >> # lvs -a
>> >>    LV              VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
>> >>    my-pool         myvg twi-aotz--  2.44t             69.04  4.90
>> >>    [my-pool_tdata] myvg Twi-ao----  2.44t
>> >>    [my-pool_tmeta] myvg ewi-ao---- 15.81g
>>
>> Is this everything? Is this a pool used by docker, which does not (did
>> not) use LVM to manage thin-volumes?
>>
>> >> However, when I dump the thin pool metadata and look at the
>> mapped_blocks
>> >> for the 2 devices in the pool, I can only account for about 950GB.
>> Here is
>> >> the superblock and device entries from the metadata xml.  There are no
>> >> other devices listed in the metadata:
>> >>
>> >> <superblock uuid="" time="34" transaction="68" flags="0" version="2"
>> >> data_block_size="128" nr_data_blocks="0">
>> >>    <device dev_id="1" mapped_blocks="258767" transaction="0"
>> >> creation_time="0" snap_time="14">
>> >>    <device dev_id="8" mapped_blocks="15616093" transaction="27"
>> >> creation_time="15" snap_time="34">
>> >>
>> >> That first device looks like it has about 16GB allocated to it and the
>> >> second device about 950GB.  So, I would expect lvs to show somewhere
>> >> between 950G and 966G. Is something wrong, or am I misunderstanding how to
>> read
>> >> the metadata dump?  Where is the other 700 or so GB that lvs is showing
>> >> used?
>> >
>> > The non-zero snap_time suggests that you're using snapshots.  In which
>> case it
>> > could just be that there is common data shared between volumes that is
>> getting counted
>> > more than once.
>> >
>> > You can confirm this using the thin_ls tool and specifying a format
>> line that
>> > includes EXCLUSIVE_BLOCKS or SHARED_BLOCKS.  LVM doesn't take shared
>> blocks into
>> > account because it would have to scan all the metadata to calculate what's
>> shared.
>>
>> LVM just queries DM, and displays whatever that provides. You could see
>> that in `dmsetup status` output, there are two pairs of '/' separated
>> entries - first is metadata usage (USED_BLOCKS/ALL_BLOCKS), second data
>> usage (USED_CHUNKS/ALL_CHUNKS).
>>
>> So the error lies somewhere between dmsetup and kernel.
>>
>> What is kernel/lvm version?
>> Is thin_check_executable configured in lvm.conf?
>>
>> -- Martian
>>
>




Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-10 19:30 [linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output John Hamilton
2018-05-11  8:21 ` Joe Thornber
2018-05-11  8:54   ` Marian Csontos
2018-05-11 17:09     ` John Hamilton
2018-05-16 14:43       ` John Hamilton
