* cache support
@ 2014-02-05  1:51 Paul B. Henson
  2014-02-05  7:35 ` Oliver Rath
  2014-02-05  9:35 ` Zdenek Kabelac
  0 siblings, 2 replies; 18+ messages in thread
From: Paul B. Henson @ 2014-02-05  1:51 UTC (permalink / raw)
  To: lvm-devel

So I was browsing through the last month or so of list archives and
reviewing the cache support under development, and had a question regarding
use cases. Unless I am misunderstanding, it looks like the support being
built so far addresses the use case of taking two lv's (a large slow origin
lv and a small fast cache lv) and combining them into a third lv.

Is there any intention to support a use case where you can attach a cache to
the underlying pv, and have every single lv created on that pv cached? On a
virtualization server I'm working on, I have an md raid1 of two 256G SSD's,
and an md raid10 of four 2TB hard drives. What I'd like to do is create a
cache device consisting of those two raid devices, and create a pv/vg on top
of that. I know that is possible with the raw underlying dm-cache
implementation, but it didn't look like the initial code dropped in lvm so
far would support something like that?

Thanks.




* cache support
  2014-02-05  1:51 cache support Paul B. Henson
@ 2014-02-05  7:35 ` Oliver Rath
  2014-02-05 20:12   ` Paul B. Henson
  2014-02-05  9:35 ` Zdenek Kabelac
  1 sibling, 1 reply; 18+ messages in thread
From: Oliver Rath @ 2014-02-05  7:35 UTC (permalink / raw)
  To: lvm-devel

Hi Paul,

you can create a cache device with dm-cache, e.g. /dev/mapper/cache0, which
consists of your underlying disks cached by your SSD. The critical point,
imho, is that you have to create the cache device first, then the PV on the
*cache device* cache0, then create the VG on top of cache0, taking care to
get the right name. As a result you see the disk IDs twice, so you have to
filter them out in lvm.conf.
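For example, roughly like this (device names, the split of the SSD into a
small metadata area and a large cache area, and the 256 KiB block size are
only placeholders):

# SSD raid1 split into a small /dev/ssd_meta and a large /dev/ssd_cache
dmsetup create cache0 --table "0 $(blockdev --getsz /dev/md_slow) cache /dev/ssd_meta /dev/ssd_cache /dev/md_slow 512 1 writeback default 0"
pvcreate /dev/mapper/cache0
vgcreate vg /dev/mapper/cache0

# and in lvm.conf, let lvm see only cache0:
# filter = [ "a|^/dev/mapper/cache0$|", "r|.*|" ]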

@list: Is there a solution for this "doubling" case?

Alternatively you can use bcache, which needs a "superblock" on both
devices, so you aren't bothered by the doubled IDs. Or you can try EnhanceIO
(not in the standard kernel yet), which caches your device transparently
without using the dm mechanism or creating a new virtual device.
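The bcache route looks roughly like this (device names are placeholders; the
cache-set UUID comes from the make-bcache output):

make-bcache -C /dev/md_ssd      # format the SSD raid1 as a cache set
make-bcache -B /dev/md_slow     # format the backing device
echo <cset-uuid> > /sys/block/bcache0/bcache/attach
pvcreate /dev/bcache0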

Hth
Oliver


On 05.02.2014 02:51, Paul B. Henson wrote:
> So I was browsing through the last month or so of list archives and
> reviewing the cache support under development, and had a question regarding
> use cases. Unless I am misunderstanding, it looks like the support being
> built so far addresses the use case of taking two lv's (a large slow origin
> lv and a small fast cache lv) and combining them into a third lv.
>
> Is there any intention to support a use case where you can attach a cache to
> the underlying pv, and have every single lv created on that pv cached? On a
> virtualization server I'm working on, I have an md raid1 of two 256G SSD's,
> and an md raid10 of four 2TB hard drives. What I'd like to do is create a
> cache device consisting of those two raid devices, and create a pv/vg on top
> of that. I know that is possible with the raw underlying dm-cache
> implementation, but it didn't look like the initial code dropped in lvm so
> far would support something like that?
>
> Thanks.
>
> --
> lvm-devel mailing list
> lvm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/lvm-devel




* cache support
  2014-02-05  1:51 cache support Paul B. Henson
  2014-02-05  7:35 ` Oliver Rath
@ 2014-02-05  9:35 ` Zdenek Kabelac
  2014-02-05 20:20   ` Paul B. Henson
  1 sibling, 1 reply; 18+ messages in thread
From: Zdenek Kabelac @ 2014-02-05  9:35 UTC (permalink / raw)
  To: lvm-devel

On 5.2.2014 02:51, Paul B. Henson wrote:
> So I was browsing through the last month or so of list archives and
> reviewing the cache support under development, and had a question regarding
> use cases. Unless I am misunderstanding, it looks like the support being
> built so far addresses the use case of taking two lv's (a large slow origin
> lv and a small fast cache lv) and combining them into a third lv.
>
> Is there any intention to support a use case where you can attach a cache to
> the underlying pv, and have every single lv created on that pv cached? On a
> virtualization server I'm working on, I have an md raid1 of two 256G SSD's,
> and an md raid10 of four 2TB hard drives. What I'd like to do is create a
> cache device consisting of those two raid devices, and create a pv/vg on top
> of that. I know that is possible with the raw underlying dm-cache
> implementation, but it didn't look like the initial code dropped in lvm so
> far would support something like that?


Well - there is work in progress in upstream git - but it's highly 
'experimental' and its user-space API can change any minute - so it's only 
useful for playing - but not for any real use yet.

The device stacking is getting quite complex.

Side note - lvm2 now supports its own metadata format for md raid1 - this
should allow better handling of the device stack (it's using the same kernel
driver as mdraid) - use just a single command to activate everything in the
proper order.
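I.e. instead of layering lvm on top of an md mirror you would create the
mirror inside lvm, roughly (size and names are just an example):

lvcreate --type raid1 -m 1 -L 100G -n lv vg
vgchange -ay vg    # activates the whole stack in one step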

The current version of dm-cache supports only 1:1 mapping - so one large cache
shared by multiple LVs is not supported. You will need to prepare smaller
individual cache pools for each of your LVs.

The lvm2 side is designed to allow playing with different cache targets in
the future.

Zdenek




* cache support
  2014-02-05  7:35 ` Oliver Rath
@ 2014-02-05 20:12   ` Paul B. Henson
  0 siblings, 0 replies; 18+ messages in thread
From: Paul B. Henson @ 2014-02-05 20:12 UTC (permalink / raw)
  To: lvm-devel

> From: Oliver Rath
> Sent: Tuesday, February 04, 2014 11:36 PM
>
> you can create a cache-device with dm-cache, i.e. /dev/mapper/cache0,
> which consists your underlying disks cached by your ssd.

I investigated that, although I never prototyped it. Using dm-cache directly
involves size calculations and an annoying number of messy manual operations;
I was hoping lvm would hide most of that from the user level and make it
easier to do :).
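(For instance, just going by the rule of thumb in the kernel's dm-cache
documentation - 4 MB plus 16 bytes per cache block - a 256 GB cache split into
256 KiB blocks is about a million blocks, so roughly 4 MB + 16 MB = 20 MB of
metadata, and then you still have to work out all the sector counts for the
dmsetup table by hand.)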

> cache0, getting the right name for this. As result you get the disk ids
> twice, so you have to filter the ids by lvm.conf.
> 
> @list: Is there a solution for this "doubling" case?

I've deployed lvm on mdraid in the past, specifically on mirrors using the
older metadata format such that each half of the mirror appears to be a pv
on its own. I never fiddled with filters, but somehow lvm figured out that
it should use the /dev/md device for the pv and not the raw /dev/sd
components.
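(I assume that's lvm's md component detection at work; in lvm.conf terms
something like the following, which I believe is on by default:

devices {
    md_component_detection = 1
    # or an explicit filter, e.g.
    # filter = [ "a|^/dev/md.*|", "r|^/dev/sd.*|" ]
}

though I've never had to set it by hand.)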

> Alternativly you can use bcache, which needs a "superblock" on both

I looked at bcache, which is relatively easy to use and does match my use
case. However, after hanging out on the mailing list for a while, the
developers didn't seem to reliably respond to postings of crashes or data
corruption or other issues, which made me somewhat leery of deployment in
production. On the device mapper mailing list, the developers seem to
respond to pretty much every question or problem report about dm-cache,
which gives one a little bit more of a warm fuzzy ;).

> enhanceIO (not in standard kernel yet), which caches your device

I took a look at that too, but I'd really prefer something in the mainline
kernel for production deployment.

Thanks much.




* cache support
  2014-02-05  9:35 ` Zdenek Kabelac
@ 2014-02-05 20:20   ` Paul B. Henson
  2014-02-05 21:29     ` Zdenek Kabelac
  0 siblings, 1 reply; 18+ messages in thread
From: Paul B. Henson @ 2014-02-05 20:20 UTC (permalink / raw)
  To: lvm-devel

> From: Zdenek Kabelac
> Sent: Wednesday, February 05, 2014 1:35 AM
>
> Well - there is work in progress in upstream git - but it's highly
> 'experimental' and its user-space API can change any minute - so it's only
> useful for playing - but not for any real use yet.

Agreed :). I'm just looking to try and make sure that as it stabilizes my
desired use case fits in the picture ;).

> Side note - lvm2 now supports its own metadata format for md raid1 - this
> should allow better handling of the device stack (it's using the same kernel
> driver as mdraid) - use just a single command to activate everything in the
> proper order.

I've read somewhat about the integration of mdraid and lvm, but not enough
to fully understand it or be comfortable about switching from classic mdraid
to lvm integrated mdraid.

> The current version of dm-cache supports only 1:1 mapping - so one large cache
> shared by multiple LVs is not supported. You will need to prepare smaller
> individual cache pools for each of your LVs.

I'm not sure what you mean here; I confirmed on the device mapper mailing
list that using dm-cache directly would support my desired stacking of
placing a PV on top of a dm-cache device that is sitting on top of a raw SSD
raid1 md cache device and a raw HD raid10 origin device, effectively using
the single cache device to cache all of the LV's created on the PV. I don't
really want to split up the cache device into bits and pieces for each
individual LV; that doesn't seem very efficient. I'd rather have the entire
cache device available for whichever LVs happen to be hot at a given time.

So it's really just a question of whether or not lvm is going to support a
user-friendly layer on top of dm-cache for this type of stacking, or if
somebody will be stuck using dm-cache directly if they want to implement
something like this.

Thanks.





* cache support
  2014-02-05 20:20   ` Paul B. Henson
@ 2014-02-05 21:29     ` Zdenek Kabelac
  2014-02-06  1:36       ` Paul B. Henson
  0 siblings, 1 reply; 18+ messages in thread
From: Zdenek Kabelac @ 2014-02-05 21:29 UTC (permalink / raw)
  To: lvm-devel

On 5.2.2014 21:20, Paul B. Henson wrote:
>> From: Zdenek Kabelac
>> Sent: Wednesday, February 05, 2014 1:35 AM
> I've read somewhat about the integration of mdraid and lvm, but not enough
> to fully understand it or be comfortable about switching from classic mdraid
> to lvm integrated mdraid.

Well, if you miss a feature from mdadm you may request some enhancements.
It should give you more options - since an LV doesn't need to span the whole
PV - i.e. you could have 4 disks in a VG and build some LVs as raid0/stripe,
others as raid1, and some LVs as raid5.
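E.g. something along these lines (sizes and names are just examples):

lvcreate --type raid1 -m 1 -L 100G -n lv_mirror vg
lvcreate --type raid5 -i 3 -L 300G -n lv_raid5 vg
lvcreate -i 4 -I 64 -L 200G -n lv_stripe vg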


>> The current version of dm-cache supports only 1:1 mapping - so one large cache
>> shared by multiple LVs is not supported. You will need to prepare smaller
>> individual cache pools for each of your LVs.
>
> I'm not sure what you mean here; I confirmed on the device mapper mailing
> list that using dm-cache directly would support my desired stacking of
> placing a PV on top of a dm-cache device that is sitting on top of a raw SSD
> raid1 md cache device and a raw HD raid10 origin device, effectively using
> the single cache device to cache all of the LV's created on the PV. I don't
> really want to split up the cache device into bits and pieces for each
> individual LV; that doesn't seem very efficient. I'd rather have the entire
> cache device available for whichever LVs happen to be hot at a given time.
>
> So it's really just a question of whether or not lvm is going to support a
> user-friendly layer on top of dm-cache for this type of stacking, or if
> somebody will be stuck using dm-cache directly if they want to implement
> something like this.


lvm2 does not support caching of PVs - that's the layer below lvm2. Your
proposed idea would be hard to implement efficiently.

lvm2 would have to create some huge 'virtual' device combined from all PVs in
the VG (with special handling for segments like mirrors/raids) - this would
then always be used as a cache for any LV activated from this virtual layer -
with lots of trouble during activation.

With per-LV granularity you get the option to choose a different policy for
each LV.

Note - it should be possible to create a cached thin pool data LV - and then
you get all thin volumes cached through the data device.

We may consider the option to use a single cache pool for multiple single
linear LVs - since in this case we might be able to resolve the tricky virtual
mapping.

Zdenek




* cache support
  2014-02-05 21:29     ` Zdenek Kabelac
@ 2014-02-06  1:36       ` Paul B. Henson
  2014-02-11 17:24         ` Brassow Jonathan
  0 siblings, 1 reply; 18+ messages in thread
From: Paul B. Henson @ 2014-02-06  1:36 UTC (permalink / raw)
  To: lvm-devel

> From: Zdenek Kabelac
> Sent: Wednesday, February 05, 2014 1:30 PM
>
> lvm2 does not support caching of PVs - that's the layer below lvm2. Your
> proposed idea would be hard to implement efficiently.

Hmm, yes, I see what you mean. I guess to go that way I'd have to use
dm-cache directly or reconsider bcache to create a cached device to feed to
lvm as a PV.

> We may consider the option to use a single cache pool for multiple single
> linear LVs - since in this case we might be able to resolve tricky virtual
> mapping.

It would probably work out for my use case if the PV itself wasn't cached
but the entire SSD cache device could be shared amongst all of the LV's. I
really don't want to split the cache into separate pieces per LV though;
that's just not going to scale.

Thanks.





* cache support
  2014-02-06  1:36       ` Paul B. Henson
@ 2014-02-11 17:24         ` Brassow Jonathan
  2014-02-11 21:04           ` Paul B. Henson
  0 siblings, 1 reply; 18+ messages in thread
From: Brassow Jonathan @ 2014-02-11 17:24 UTC (permalink / raw)
  To: lvm-devel


On Feb 5, 2014, at 7:36 PM, Paul B. Henson wrote:

>> 
>> We may consider the option to use a single cache pool for multiple single
>> linear LVs - since in this case we might be able to resolve tricky virtual
>> mapping.
> 
> It would probably work out for my use case if the PV itself wasn't cached
> but the entire SSD cache device could be shared amongst all of the LV's. I
> really don't want to split the cache into separate pieces per LV though,
> that's just not going to scale.
> 
> Thanks.

What do you think about caching a thin pool?  Then you would get the benefit of caching for all of your LVs and you would get all the benefits of thin and thin snapshots.

We are still working on getting all the cache pieces in place in LVM.  Thanks for watching.
 brassow




* cache support
  2014-02-11 17:24         ` Brassow Jonathan
@ 2014-02-11 21:04           ` Paul B. Henson
  2014-02-12 17:39             ` Brassow Jonathan
  0 siblings, 1 reply; 18+ messages in thread
From: Paul B. Henson @ 2014-02-11 21:04 UTC (permalink / raw)
  To: lvm-devel

> From: Brassow Jonathan
> Sent: Tuesday, February 11, 2014 9:24 AM
>
> What do you think about caching a thin pool?  Then you would get the
> benefit of caching for all of your LVs and you would get all the benefits of
> thin and thin snapshots.

Hmm, I'm not that familiar with thin allocation in lvm. Basically, I would
allocate the entire PV as a thin LV, attach my entire cache to that, and
then create all of my production LV's with thin allocation out of the single
thin LV? Interesting thought. I have sufficient storage for at least the
midterm; my perhaps flawed understanding is that going with thin LVs
introduces an extra layer of complexity and inefficiency versus regular
allocation?




* cache support
  2014-02-11 21:04           ` Paul B. Henson
@ 2014-02-12 17:39             ` Brassow Jonathan
  2014-02-17 22:12               ` Paul B. Henson
  2014-03-11 23:54               ` Paul B. Henson
  0 siblings, 2 replies; 18+ messages in thread
From: Brassow Jonathan @ 2014-02-12 17:39 UTC (permalink / raw)
  To: lvm-devel


On Feb 11, 2014, at 3:04 PM, Paul B. Henson wrote:

>> From: Brassow Jonathan
>> Sent: Tuesday, February 11, 2014 9:24 AM
>> 
>> What do you think about caching a thin pool?  Then you would get the
>> benefit of caching for all of your LVs and you would get all the benefits of
>> thin and thin snapshots.
> 
> Hmm, I'm not that familiar with thin allocation in lvm. Basically, I would
> allocate the entire PV as a thin LV, attach my entire cache to that, and
> then create all of my production LV's with thin allocation out of the single
> thin LV? Interesting thought. I have sufficient storage for at least the
> midterm, my perhaps flawed understanding is that going with thin LV's
> introduces an extra layer of complexity and inefficiency versus regular
> allocation?

I'll let others weigh in on the overhead of introducing thin-provisioning.  However, my experience is that the overhead is very small and the benefit is very large.

I imagine the steps being something like this:
# Create your VG with slow and fast block devices
1~> vgcreate vg /dev/slow_sd[abcdef]1 /dev/fast_sd[abcde]1

# Create data portion of your thin pool using all your slow devices
2~> lvcreate -l <all slow dev extents> -n thinpool vg /dev/slow_sd[abcdef]1

# Create the metadata portion of your thin pool
#  use fast devs to keep overhead down
#  use raid1 to provide redundancy
3~> lvcreate --type raid1 -l <1/1000th size of data LV> -n thinpool_metadata vg /dev/fast_sd[ab]1

# Create cache pool LV that will be used to cache the data portion of the thin pool
#  You can use 'writethrough' mode to speed up reads, but still have writes hit the slow dev
#  or you can create the data & metadata areas of the cache pool separately using raid and then
#  convert those into a cache pool... lots of options here to improve redundancy.  I'm working
#  on man page changes/additions to make this clear and simple.  For now, we'll just create
#  a cachepool simple LV.
4~> lvcreate --type cache_pool -L <desired cache size> -n cachepool vg /dev/fast_sd[abcdef]1

# Use the cache pool to create a cached LV of the thin pool data device
5~> lvconvert --type cache --cachepool vg/cachepool vg/thinpool

# The data portion of your 'thinpool' LV is now cached.
# Make the thinpool using the cached data LV 'thinpool' and the fast metadata LV 'thinpool_metadata'
6~> lvconvert --thinpool vg/thinpool --poolmetadata thinpool_metadata

You now have a very fast thin pool from which you will create thin volumes and snapshots.  Everything is cached with low overhead.

7~> lvcreate -T vg/thinpool -n my_lv1 -V 200G
8~> lvcreate -T vg/thinpool -n my_lv2 -V 200G
...

 brassow

N.B.  You must specify '--with-cache=internal' when configuring LVM - cache is off by default until it is no longer considered experimental.  You should be using dm-cache kernel module version 1.3.0+.  Support for 'lvconvert --type cache[-pool]' was only just added this morning.
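The configure step would look something like this (assuming you also want
thin support built in - adjust to your usual options):

./configure --with-cache=internal --with-thin=internal
make && make install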





* cache support
  2014-02-12 17:39             ` Brassow Jonathan
@ 2014-02-17 22:12               ` Paul B. Henson
  2014-03-11 23:54               ` Paul B. Henson
  1 sibling, 0 replies; 18+ messages in thread
From: Paul B. Henson @ 2014-02-17 22:12 UTC (permalink / raw)
  To: lvm-devel

> From: Brassow Jonathan
> Sent: Wednesday, February 12, 2014 9:40 AM
> 
> I'll let others weigh in on the overhead of introducing thin-provisioning.
> However, my experience is that the overhead is very small and the benefit is
> very large.
[...]
> You now have a very fast thin pool from which you will create thin volumes
> and snapshots.  Everything is cached with low overhead.

Hmm, interesting; this would allow me to work within the confines of the
intended lvm cache support yet achieve my goal of using a single cache
device for all of my LV's. That does seem preferable to trying to manage
dm-cache manually with dmsetup.

Thanks.




* cache support
  2014-02-12 17:39             ` Brassow Jonathan
  2014-02-17 22:12               ` Paul B. Henson
@ 2014-03-11 23:54               ` Paul B. Henson
  2014-03-17 15:56                 ` Brassow Jonathan
  1 sibling, 1 reply; 18+ messages in thread
From: Paul B. Henson @ 2014-03-11 23:54 UTC (permalink / raw)
  To: lvm-devel

> From: Brassow Jonathan
> Sent: Wednesday, February 12, 2014 9:40 AM
>
> I'll let others weigh in on the overhead of introducing thin-provisioning.
> However, my experience is that the overhead is very small and the benefit is
> very large.

So I think I'd like to try out your recommendation of using thin
provisioning to allow dm-cache to cache all of my LVs, and was wondering if
you guys had any rough idea of when you might release a version of lvm2
with support for cache devices? The box in question is pseudo-production,
and while I think I'm willing to risk a freshly released new feature, I
don't think I want to go quite so far as to run git head on it ;).

The intention is to allow insertion of a cache device live with no
disruption in service, right? So theoretically I could get the thin
provisioned pool all set up now and start using it, and then when the
version with cache support is released, transparently slip in the cache
device?

Is there a recommended kernel version for thin provisioning? Right now I'm
running 3.12, but I thought I saw a bug recently fly by involving thinpool
metadata corruption that's going to be fixed in 3.14; would it be better to
wait for a stable release of 3.14?

From reading mailing list archives, if your metadata volume runs out of
space, your entire thin pool is corrupted? And historically you were unable
to resize or extend your metadata volume? I see a number of mentions of that
ability coming soon, but didn't see anything actually announcing it was
available. At this point, is the size of the metadata volume still fixed as
of initial creation? 

The intended size of my thinpool is going to be about 3.64TB:

/dev/md3   vg_vz lvm2 a--    3.64t  3.61t

Based on the recommendation below of 1/1000 of that for metadata, that would
be about 3.75GB. This pool is going to have *lots* of snapshots: there will
be a set of filesystems for a template VM, each of which will be
snapshotted/cloned when a new VM is created, and then all of those will have
snapshots for backups. Given that, would 3.75GB for the metadata volume be
sufficient, or would it be better to crank it up a little?

Given /dev/md3 is my "slow" device (raid10 of 4 x 2TB), and /dev/md2 is my
"fast" device (raid1 of 2 x 256G SSD), plugging into your example gives me:

# vgcreate vg_vz /dev/md3 /dev/md2
# lvcreate -l 953800 -n thinpool vg_vz /dev/md3
# lvcreate -L 3.75G -n thinpool_metadata vg_vz /dev/md2
# lvconvert --thinpool vg_vz/thinpool --poolmetadata thinpool_metadata

At this point, I would have a thinpool ready to use, and can work with it
until a version of lvm2 is released with cache support, at which point I
could run:

# lvcreate --type cache_pool -l 100%PVS -n cachepool vg_vz /dev/md2
# lvconvert --type cache --cachepool vg_vz/cachepool vg_vz/thinpool

To transparently add the cache device to my existing thinpool?

Thanks much.

> I imagine the steps being something like this:
> # Create your VG with slow and fast block devices
> 1~> vgcreate vg /dev/slow_sd[abcdef]1 /dev/fast_sd[abcde]1
> 
> # Create data portion of your thin pool using all your slow devices
> 2~> lvcreate -l <all slow dev extents> -n thinpool vg /dev/slow_sd[abcdef]1
> 
> # Create the metadata portion of your thin pool
> #  use fast devs to keep overhead down
> #  use raid1 to provide redundancy
> 3~> lvcreate --type raid1 -l <1/1000th size of data LV> -n thinpool_metadata vg /dev/fast_sd[ab]1
> 
> # Create cache pool LV that will be used to cache the data portion of the thin pool
> #  You can use 'writethrough' mode to speed up reads, but still have writes hit the slow dev
> #  or you can create the data & metadata areas of the cache pool separately using raid and then
> #  convert those into a cache pool... lots of options here to improve redundancy.  I'm working
> #  on man page changes/additions to make this clear and simple.  For now, we'll just create
> #  a cachepool simple LV.
> 4~> lvcreate --type cache_pool -L <desired cache size> -n cachepool vg /dev/fast_sd[abcdef]1
> 
> # Use the cache pool to create a cached LV of the thin pool data device
> 5~> lvconvert --type cache --cachepool vg/cachepool vg/thinpool
> 
> # The data portion of your 'thinpool' LV is now cached.
> # Make the thinpool using the cached data LV 'thinpool' and the fast metadata LV 'thinpool_metadata'
> 6~> lvconvert --thinpool vg/thinpool --poolmetadata thinpool_metadata
> 
> You now have a very fast thin pool from which you will create thin volumes
> and snapshots.  Everything is cached with low overhead.
> 
> 7~> lvcreate -T vg/thinpool -n my_lv1 -V 200G
> 8~> lvcreate -T vg/thinpool -n my_lv2 -V 200G
> ...
> 
>  brassow
> 
> N.B.  You must specify '--with-cache=internal' when configuring LVM - cache
> is off by default until it is no longer considered experimental.  You should
> be using dm-cache kernel module version 1.3.0+.  Support for 'lvconvert
> --type cache[-pool]' was only just added this morning.





* cache support
  2014-03-11 23:54               ` Paul B. Henson
@ 2014-03-17 15:56                 ` Brassow Jonathan
  2014-03-18  1:07                   ` Paul B. Henson
  2014-03-29  1:38                   ` Paul B. Henson
  0 siblings, 2 replies; 18+ messages in thread
From: Brassow Jonathan @ 2014-03-17 15:56 UTC (permalink / raw)
  To: lvm-devel


On Mar 11, 2014, at 6:54 PM, Paul B. Henson wrote:

>> From: Brassow Jonathan
>> Sent: Wednesday, February 12, 2014 9:40 AM
>> 
>> I'll let others weigh-in on the overhead of introducing thin-provisioning.
>> However, my experience is that the overhead is very small and the benefit
> is
>> very large.
> 
> So I think I'd like to try out your recommendation of using thin
> provisioning to allow dm-cache to cache all of my lv's, and was wondering if
> you guys had any rough idea on when you might release a version of  lvm2
> with support for cache devices? The box in question is pseudo-production,
> and while I think I'm willing to risk a freshly released new feature, I
> don't think I want to go quite so far as to run git head on it ;).
> 
> The intention is to allow insertion of a cache device live with no
> disruption in service, right? So theoretically I could get the thin
> provisioned pool all set up now and start using it, and then when the
> version with cache support is released, transparently slip in the cache
> device?
> 
> Is there a recommended kernel version for thin provisioning? Right now I'm
> running 3.12, but I thought I saw a bug recently fly by involving thinpool
> metadata corruption that's going to be fixed in 3.14, would it be better to
> wait for a stable release of 3.14?
> 
>> From reading mailing list archives, if your metadata volume runs out of
> space, your entire thin pool is corrupted? And historically you were unable
> to resize or extend your metadata volume? I see a number of mentions of that
> ability coming soon, but didn't see anything actually announcing it was
> available. At this point, is the size of the metadata volume still fixed as
> of initial creation? 
> 
> The intended size of my thinpool is going to be about 3.64TB:
> 
> /dev/md3   vg_vz lvm2 a--    3.64t  3.61t
> 
> Based on the recommendation below of 1/1000 of that for metadata, that would
> be about 3.75GB. This pool is going to have *lots* of snapshots, there are
> going to be a set of filesystems for a template vm, each of which will be
> snapshot'd/cloned when a new vm is created, and then all of those will have
> snapshots for backups. Given that, would 3.75GB for the metadata volume be
> sufficient, or would it be better to crank it up a little?
> 
> Given /dev/md3 is my "slow" device (raid10 of 4 x 2TB), and /dev/md2 is my
> "fast" device (raid1 of 2 x 256G SSD), plugging into your example gives me:
> 
> # vgcreate vg_vz /dev/md3 /dev/md2
> # lvcreate -l 953800 -n thinpool vg_vz /dev/md3
> # lvcreate -L 3.75G -n thinpool_metadata vg_vz /dev/md2
> # lvconvert --thinpool vg_vz/thinpool --poolmetadata thinpool_metadata
> 
> At this point, I would have a thinpool ready to use, and can work with it
> until a version of lvm2 is released with cache support, at which point I
> could run:
> 
> # lvcreate --type cache_pool -l 100%PVS -n cachepool vg_vz /dev/md2
> # lvconvert --type cache --cachepool vg_vz/cachepool vg_vz/thinpool
> 
> To transparently add the cache device to my existing thinpool?

Yes, that is basically the idea.  However, converting the thinpool is a little trickier.  You are already using the fast device for the thinpool metadata device (which seems awfully large in your example).  When you cache the thinpool, I think you just want to cache the data section.
# lvconvert --type cache --cachepool vg/cachepool vg/thinpool_tdata

 brassow






* cache support
  2014-03-17 15:56                 ` Brassow Jonathan
@ 2014-03-18  1:07                   ` Paul B. Henson
  2014-03-29  1:38                   ` Paul B. Henson
  1 sibling, 0 replies; 18+ messages in thread
From: Paul B. Henson @ 2014-03-18  1:07 UTC (permalink / raw)
  To: lvm-devel

> From: Brassow Jonathan
> Sent: Monday, March 17, 2014 8:56 AM
>
> Yes, that is basically the idea.  However, converting the thinpool is a little
> trickier.  You are already using the fast device for the thinpool metadata
> device (which seems awfully large in your example).  When you cache the
> thinpool, I think you just want to cache the data section.
> # lvconvert --type cache --cachepool vg/cachepool vg/thinpool_tdata

Ah right; in your original example you had connected the cache to the
thinpool data lv before creating the actual thinpool. When I tried to convert
that into adding the cache after the fact, I didn't take that into
consideration.

In your original example, it said to size the metadata device to
approximately 1/1000 of the data lv size, so for a 4 TB data lv that would
actually be almost 4 GB? The estimation utility says about 2.4 GB:

# thin_metadata_size -b 64k -s 4t -m 100000 -u g
thin_metadata_size - 2.41 gigabytes estimated metadata area size

Evidently online metadata pool resize is supported now, so I guess I don't
have to wildly overestimate from the start; as long as I keep an eye on it I
can always top it off before it runs out.
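(I'm guessing the resize itself would be something along the lines of

# lvextend --poolmetadatasize +1G vg_vz/thinpool

assuming that option works on an active pool in whatever version I end up
running.)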

Thanks.




* cache support
  2014-03-17 15:56                 ` Brassow Jonathan
  2014-03-18  1:07                   ` Paul B. Henson
@ 2014-03-29  1:38                   ` Paul B. Henson
  2014-03-29  1:43                     ` Paul B. Henson
  1 sibling, 1 reply; 18+ messages in thread
From: Paul B. Henson @ 2014-03-29  1:38 UTC (permalink / raw)
  To: lvm-devel

On Mon, Mar 17, 2014 at 10:56:17AM -0500, Brassow Jonathan wrote:

> > Given /dev/md3 is my "slow" device (raid10 of 4 x 2TB), and /dev/md2 is my
> > "fast" device (raid1 of 2 x 256G SSD), plugging into your example gives me:
> > 
> > # vgcreate vg_vz /dev/md3 /dev/md2
> > # lvcreate -l 953800 -n thinpool vg_vz /dev/md3
> > # lvcreate -L 3.75G -n thinpool_metadata vg_vz /dev/md2
> > # lvconvert --thinpool vg_vz/thinpool --poolmetadata thinpool_metadata

So I updated to the latest 3.13.7 kernel and decided to try and spin up my
thin pool. The first couple steps went well:

# lvcreate -l 945908 -n thinpool vg_vz /dev/md3
  Logical volume "thinpool" created

# lvcreate -L 2.5G -n thinpool_metadata vg_vz /dev/md2
  Logical volume "thinpool_metadata" created

But lvconvert seems to have gone sideways:

# lvconvert --thinpool vg_vz/thinpool --poolmetadata thinpool_metadata
  device-mapper: remove ioctl on  failed: Device or resource busy
  Logical volume "lvol0" created
  device-mapper: reload ioctl on  failed: Invalid argument
  Failed to activate pool logical volume vg_vz/thinpool.

The thinpool seems to sort of be there, but it's not active, and lvs gives
an error I assume is related to it:

# lvs vg_vz
  dm_report_object: report function failed for field data_percent
  LV              VG    Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  thinpool        vg_vz twi---tz--   3.61t

I can't seem to activate it:

# vgchange -a y vg_vz
  device-mapper: reload ioctl on  failed: Invalid argument
  16 logical volume(s) in volume group "vg_vz" now active

I finally ended up trying to delete it:

# lvremove vg_vz/thinpool
Do you really want to remove and DISCARD logical volume thinpool? [y/n]: y
  Logical volume "thinpool" successfully removed

And that seems to have cleaned everything up.

The ioctl complaints seem like they might be related to:

	https://bugzilla.redhat.com/show_bug.cgi?id=927437#c10

But I didn't see anything about failing to activate the pool?

I tried it again; this time it said:

# lvconvert  --thinpool vg_vz/thinpool --poolmetadata thinpool_metadata
  device-mapper: remove ioctl on  failed: Device or resource busy
  Logical volume "lvol0" created
  device-mapper: create ioctl on vg_vz-thinpool_tmeta failed: Device or resource busy
  Failed to activate pool logical volume vg_vz/thinpool.

lvs doesn't give an error now, but any attempt to make a thin volume fails
with the same error lvconvert gave:

# lvcreate -T vg_vz/thinpool -V 10G -n testlv
  device-mapper: create ioctl on vg_vz-thinpool_tmeta failed: Device or resource busy

Am I doing something wrong here?

Also, what's the lvol0_pmspare volume that seems to show up? It's the exact
same size as my metadata lv?

# lvs --all vg_vz
  LV               VG    Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]  vg_vz ewi-------   2.50g
  thinpool         vg_vz twi---tz--   3.61t
  [thinpool_tdata] vg_vz Twi-------   3.61t
  [thinpool_tmeta] vg_vz ewi-------   2.50g
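In case it helps, the next things I was going to look at are the chunk size
lvm picked for the pool and the table it tried to load - guessing that's what
'Invalid block size' refers to:

# lvs -o +chunksize vg_vz/thinpool
# dmsetup table | grep thinpool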

Thanks...




* cache support
  2014-03-29  1:38                   ` Paul B. Henson
@ 2014-03-29  1:43                     ` Paul B. Henson
  2014-04-02 20:34                       ` Brassow Jonathan
  0 siblings, 1 reply; 18+ messages in thread
From: Paul B. Henson @ 2014-03-29  1:43 UTC (permalink / raw)
  To: lvm-devel

On Fri, Mar 28, 2014 at 06:38:41PM -0700, Paul B. Henson wrote:

> So I updated to the latest 3.13.7 kernel and decided to try and spin up my
> thin pool. The first couple steps went well:
> 
> # lvcreate -l 945908 -n thinpool vg_vz /dev/md3
>   Logical volume "thinpool" created
> 
> # lvcreate -L 2.5G -n thinpool_metadata vg_vz /dev/md2
>   Logical volume "thinpool_metadata" created
> 
> But lvconvert seems to have gone sideways:
> 
> # lvconvert --thinpool vg_vz/thinpool --poolmetadata thinpool_metadata
>   device-mapper: remove ioctl on  failed: Device or resource busy
>   Logical volume "lvol0" created
>   device-mapper: reload ioctl on  failed: Invalid argument
>   Failed to activate pool logical volume vg_vz/thinpool.

I also just noticed that the kernel logged some complaints when I tried to run
the lvconvert:

Mar 28 17:59:03 virtz kernel: [82813.713227] device-mapper: table: 253:26: thin-pool: Invalid block size
Mar 28 17:59:03 virtz kernel: [82813.713232] device-mapper: ioctl: error adding target to table


Mar 28 18:11:38 virtz kernel: [83568.730506] device-mapper: table: 253:26: thin-pool: Invalid block size
Mar 28 18:11:38 virtz kernel: [83568.730510] device-mapper: ioctl: error adding target to table






* cache support
  2014-03-29  1:43                     ` Paul B. Henson
@ 2014-04-02 20:34                       ` Brassow Jonathan
  2014-04-03  2:54                         ` Paul B. Henson
  0 siblings, 1 reply; 18+ messages in thread
From: Brassow Jonathan @ 2014-04-02 20:34 UTC (permalink / raw)
  To: lvm-devel


On Mar 28, 2014, at 8:43 PM, Paul B. Henson wrote:

> On Fri, Mar 28, 2014 at 06:38:41PM -0700, Paul B. Henson wrote:
> 
>> So I updated to the latest 3.13.7 kernel and decided to try and spin up my
>> thin pool. The first couple steps went well:
>> 
>> # lvcreate -l 945908 -n thinpool vg_vz /dev/md3
>>  Logical volume "thinpool" created
>> 
>> # lvcreate -L 2.5G -n thinpool_metadata vg_vz /dev/md2
>>  Logical volume "thinpool_metadata" created
>> 
>> But lvconvert seems to have gone sideways:
>> 
>> # lvconvert --thinpool vg_vz/thinpool --poolmetadata thinpool_metadata
>>  device-mapper: remove ioctl on  failed: Device or resource busy
>>  Logical volume "lvol0" created
>>  device-mapper: reload ioctl on  failed: Invalid argument
>>  Failed to activate pool logical volume vg_vz/thinpool.
> 
> I also just noticed the kernel logged some complaints too when I tried to run
> the lvconvert:
> 
> Mar 28 17:59:03 virtz kernel: [82813.713227] device-mapper: table: 253:26: thin-pool: Invalid block size
> Mar 28 17:59:03 virtz kernel: [82813.713232] device-mapper: ioctl: error adding target to table
> 
> 
> Mar 28 18:11:38 virtz kernel: [83568.730506] device-mapper: table: 253:26: thin-pool: Invalid block size
> Mar 28 18:11:38 virtz kernel: [83568.730510] device-mapper: ioctl: error adding target to table

What version of LVM are you using?

 brassow





* cache support
  2014-04-02 20:34                       ` Brassow Jonathan
@ 2014-04-03  2:54                         ` Paul B. Henson
  0 siblings, 0 replies; 18+ messages in thread
From: Paul B. Henson @ 2014-04-03  2:54 UTC (permalink / raw)
  To: lvm-devel

On Wed, Apr 02, 2014 at 03:34:46PM -0500, Brassow Jonathan wrote:
> 
> What version of LVM are you using?

2.02.103 at the moment; it's what's currently marked stable in Gentoo
Linux. I also noticed these in the logs, even after I deleted the failed
lv:

Mar 28 18:45:08 virtz lvm[13640]: Invalid target type.
Mar 28 18:45:18 virtz lvm[13640]: Invalid target type.
Mar 28 18:45:28 virtz lvm[13640]: Invalid target type.

They went away after I rebooted.

Thanks...



