linux-lvm.redhat.com archive mirror
* [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
@ 2017-12-28 10:42 Eric Ren
  2018-01-02  8:09 ` Eric Ren
  2018-01-02 17:10 ` David Teigland
  0 siblings, 2 replies; 11+ messages in thread
From: Eric Ren @ 2017-12-28 10:42 UTC (permalink / raw)
  To: David Teigland; +Cc: LVM general discussion and development

Hi David,

I see there is a limitation on lvresizing an LV that is active on multiple nodes.
From `man lvmlockd`:

"""
limitations of lockd VGs
...
* resizing an LV that is active in the shared mode on multiple hosts
"""

This seems like a big limitation when using lvmlockd in a cluster:

"""
c1-n1:~ # lvresize -L-1G vg1/lv1
   WARNING: Reducing active logical volume to 1.00 GiB.
   THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce vg1/lv1? [y/n]: y
   LV is already locked with incompatible mode: vg1/lv1
"""

Node "c1-n1" is the last node having vg1/lv1 active on it.
Can we change the lock mode from "shared" to "exclusive" to
lvresize without having to deactivate the LV on the last node?
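
Right now the only way I can see is to deactivate the LV everywhere first,
roughly like this (just a sketch based on `man lvmlockd`; please correct me
if a step is wrong):

"""
# on every node that has vg1/lv1 active in shared mode:
lvchange -an vg1/lv1

# on one node: activate exclusively, resize, then deactivate again
lvchange -aey vg1/lv1
lvresize -L-1G vg1/lv1
lvchange -an vg1/lv1

# on every node: re-activate in shared mode
lvchange -asy vg1/lv1
"""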

It reduces availability if we have to deactivate the LV on all
nodes in order to resize it. Is there a plan to eliminate this
limitation in the near future?

Regards,
Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2017-12-28 10:42 [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes Eric Ren
@ 2018-01-02  8:09 ` Eric Ren
  2018-01-02 17:10 ` David Teigland
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Ren @ 2018-01-02  8:09 UTC (permalink / raw)
  To: David Teigland; +Cc: LVM general discussion and development

Hi David,

I see this comment on res_process():

"""
/*
  * Go through queued actions, and make lock/unlock calls on the resource
  * based on the actions and the existing lock state.
  *
  * All lock operations sent to the lock manager are non-blocking.
  * This is because sanlock does not support lock queueing.
  * Eventually we could enhance this to take advantage of lock
  * queueing when available (i.e. for the dlm).
"""

Is this the reason for the lvresize limitation with a "sh" lock: that
lvmlockd cannot up-convert "sh" to "ex" to perform the lvresize command?

Regards,
Eric

On 12/28/2017 06:42 PM, Eric Ren wrote:
> Hi David,
>
> I see there is a limitation on lvresizing an LV that is active on multiple nodes.
> From `man lvmlockd`:
>
> """
> limitations of lockd VGs
> ...
> * resizing an LV that is active in the shared mode on multiple hosts
> """
>
> This seems like a big limitation when using lvmlockd in a cluster:
>
> """
> c1-n1:~ # lvresize -L-1G vg1/lv1
>   WARNING: Reducing active logical volume to 1.00 GiB.
>   THIS MAY DESTROY YOUR DATA (filesystem etc.)
> Do you really want to reduce vg1/lv1? [y/n]: y
>   LV is already locked with incompatible mode: vg1/lv1
> """
>
> Node "c1-n1" is the last node having vg1/lv1 active on it.
> Can we change the lock mode from "shared" to "exclusive" to
> lvresize without having to deactivate the LV on the last node?
>
> It will reduce the availability if we have to deactivate LV on all
> nodes to resize. Is there plan to eliminate this limitation in the
> near future?
>
> Regards,
> Eric
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2017-12-28 10:42 [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes Eric Ren
  2018-01-02  8:09 ` Eric Ren
@ 2018-01-02 17:10 ` David Teigland
  2018-01-03  3:52   ` Eric Ren
  1 sibling, 1 reply; 11+ messages in thread
From: David Teigland @ 2018-01-02 17:10 UTC (permalink / raw)
  To: Eric Ren; +Cc: LVM general discussion and development

> * resizing an LV that is active in the shared mode on multiple hosts
> 
> This seems like a big limitation when using lvmlockd in a cluster:

Only in the case where the LV is active on multiple hosts at once,
i.e. a cluster fs, which is less common than a local fs.

In the general case, it's not safe to assume that an LV can be modified by
one node while it's being used by others, even when all of them hold
shared locks on the LV.  You'd want to prevent that in general.
Exceptions exist, but whether an exception is ok will likely depend on
what the specific change is, what application is using the LV, whether
that application can tolerate such a change.

One (perhaps the only?) valid exception I know about is extending an LV
while it's being used under a cluster fs (any cluster fs?)

(In reference to your later email, this is not related to lock queueing,
but rather to basic ex/sh lock incompatibility, and when/how to allow
exceptions to that.)

The simplest approach I can think of to allow lvextend under a cluster fs
would be a procedure like:

1. on one node: lvextend --lockopt skip -L+1G VG/LV

   That option doesn't exist, but illustrates the point that some new
   option could be used to skip the incompatible LV locking in lvmlockd.

2. on each node: lvchange --refresh VG/LV

   This updates dm on each node with the new device size.

3. gfs2_grow VG/LV or equivalent

   At this point the fs on any node can begin accessing the new space.
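
Putting the three steps together, and using the vg1/lv1 example from earlier
in the thread, the flow would look roughly like this (a sketch only:
--lockopt skip does not exist, and the gfs2 mount point is made up):

"""
# on one node: extend the LV, skipping the incompatible LV lock
# (--lockopt skip is hypothetical, as noted above)
lvextend --lockopt skip -L+1G vg1/lv1

# on each node: refresh dm so every host sees the new device size
lvchange --refresh vg1/lv1

# on any one node: grow the cluster fs into the new space
gfs2_grow /mnt/gfs2    # mount point is illustrative
"""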

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-02 17:10 ` David Teigland
@ 2018-01-03  3:52   ` Eric Ren
  2018-01-03 15:07     ` David Teigland
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Ren @ 2018-01-03  3:52 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland

Hello David,

Happy new year!

On 01/03/2018 01:10 AM, David Teigland wrote:
>> * resizing an LV that is active in the shared mode on multiple hosts
>>
>> This seems like a big limitation when using lvmlockd in a cluster:
> Only in the case where the LV is active on multiple hosts at once,
> i.e. a cluster fs, which is less common than a local fs.
>
> In the general case, it's not safe to assume that an LV can be modified by
> one node while it's being used by others, even when all of them hold
> shared locks on the LV.  You'd want to prevent that in general.
> Exceptions exist, but whether an exception is ok will likely depend on
> what the specific change is, what application is using the LV, whether
> that application can tolerate such a change.
>
> One (perhaps the only?) valid exception I know about is extending an LV
> while it's being used under a cluster fs (any cluster fs?)

The only concrete scenario I can think of is also a cluster fs, like OCFS2:
tunefs.ocfs2 can enlarge the FS online to use all of the device space.
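
For OCFS2, once the LV has been extended and refreshed on every node, growing
the fs online should be something like this (option from memory, please
double-check against tunefs.ocfs2(8)):

"""
tunefs.ocfs2 -S /dev/vg1/lv1    # grow the fs to fill the resized device, online
"""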

> (In reference to your later email, this is not related to lock queueing,
> but rather to basic ex/sh lock incompatibility, and when/how to allow
> exceptions to that.)
I thought the procedure to allow lvresize would be something like the
following if the LV is used by a cluster FS.

Assume the LV is active with a "sh" lock on multiple nodes (node1 and node2),
and we run lvextend on node1:

- node1: the "sh" lock on r1 (the LV resource) needs to up-convert:
  "sh" -> "ex";
- node2: on receiving the BAST, the "sh" lock on r1 needs to down-convert:
  "sh" -> "nl", which means the LV should be suspended;
- node1: on receiving the AST (i.e. getting the "ex" lock), lvresize is allowed.

After lvresize completes, the original lock state should be restored on every
node, and meanwhile the latest metadata can be refreshed, maybe like below:

- node1: restore the original lock mode, "ex" -> "sh"; the metadata version
  is increased, so that a request to update the metadata can be sent to the
  other nodes;
- node2: on receiving the request, "nl" -> "sh", then refresh the metadata
  from disk.

>
> The simplest approach I can think of to allow lvextend under a cluster fs
> would be a procedure like:

If there is a simple approach, I think it may be worth a try.

>
> 1. on one node: lvextend --lockopt skip -L+1G VG/LV
>
>     That option doesn't exist, but illustrates the point that some new
>     option could be used to skip the incompatible LV locking in lvmlockd.

Hmm, is it safe to just skip the locking while the LV is active on other
nodes? Is there something in the code to prevent concurrent lvm commands
from executing at the same time?

>
> 2. on each node: lvchange --refresh VG/LV
>
>     This updates dm on each node with the new device size.
>
> 3. gfs2_grow VG/LV or equivalent
>
>     At this point the fs on any node can begin accessing the new space.
It would be great.

Regards,
Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-03  3:52   ` Eric Ren
@ 2018-01-03 15:07     ` David Teigland
  2018-01-04  9:06       ` Eric Ren
  0 siblings, 1 reply; 11+ messages in thread
From: David Teigland @ 2018-01-03 15:07 UTC (permalink / raw)
  To: Eric Ren, linux-lvm

On Wed, Jan 03, 2018 at 11:52:34AM +0800, Eric Ren wrote:
> > 1. on one node: lvextend --lockopt skip -L+1G VG/LV
> > 
> >     That option doesn't exist, but illustrates the point that some new
> >     option could be used to skip the incompatible LV locking in lvmlockd.
> 
> Hmm, is it safe to just skip the locking while the LV is active on other
> nodes? Is there something in the code to prevent concurrent lvm commands
> from executing at the same time?

The VG lock is still used to protect the VG metadata change.  The LV lock
doesn't protect anything per se, it just represents that lvchange has
activated the LV on this host.  (The LV lock does not represent the
suspended/resumed state of the dm device either, as you suggested above.)
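
(As an aside, if you want to see what lvmlockd currently holds on a host, the
lock state can be dumped with lvmlockctl; a quick sketch, see lvmlockctl(8)
for the exact options:)

"""
# print lvmlockd's current lock state (VG and LV locks) on this host
lvmlockctl --info
"""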

I'll send a simple patch to skip the lv lock to try this.
Dave

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-03 15:07     ` David Teigland
@ 2018-01-04  9:06       ` Eric Ren
  2018-01-09  2:42         ` Eric Ren
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Ren @ 2018-01-04  9:06 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland

David,

On 01/03/2018 11:07 PM, David Teigland wrote:
> On Wed, Jan 03, 2018 at 11:52:34AM +0800, Eric Ren wrote:
>>> 1. on one node: lvextend --lockopt skip -L+1G VG/LV
>>>
>>>      That option doesn't exist, but illustrates the point that some new
>>>      option could be used to skip the incompatible LV locking in lvmlockd.
>> Hmm, is it safe to just skip the locking while the LV is active on other
>> nodes? Is there something in the code to prevent concurrent lvm commands
>> from executing at the same time?
> The VG lock is still used to protect the VG metadata change.  The LV lock
> doesn't protect anything per se, it just represents that lvchange has
> activated the LV on this host.  (The LV lock does not represent the
> suspended/resumed state of the dm device either, as you suggested above.)

I see, thanks for your explanation!
> I'll send a simple patch to skip the lv lock to try this.

I've tested your patch and it works very well.  Thanks very much.

Regards,
Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-04  9:06       ` Eric Ren
@ 2018-01-09  2:42         ` Eric Ren
  2018-01-09 15:42           ` David Teigland
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Ren @ 2018-01-09  2:42 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland

Hi David,

On 01/04/2018 05:06 PM, Eric Ren wrote:
> David,
>
> On 01/03/2018 11:07 PM, David Teigland wrote:
>> On Wed, Jan 03, 2018 at 11:52:34AM +0800, Eric Ren wrote:
>>>> 1. on one node: lvextend --lockopt skip -L+1G VG/LV
>>>>
>>>>      That option doesn't exist, but illustrates the point that some new
>>>>      option could be used to skip the incompatible LV locking in lvmlockd.
>>> Hmm, is it safe to just skip the locking while the LV is active on
>>> other nodes? Is there something in the code to prevent concurrent lvm
>>> commands from executing at the same time?
>> The VG lock is still used to protect the VG metadata change. The LV lock
>> doesn't protect anything per se, it just represents that lvchange has
>> activated the LV on this host.  (The LV lock does not represent the
>> suspended/resumed state of the dm device either, as you suggested above.)
>
> I see, thanks for your explanation!
>> I'll send a simple patch to skip the lv lock to try this.
>
> I've tested your patch and it works very well.  Thanks very much.

Could you please consider pushing this patch upstream? Also, is this the
same case for pvmove as for lvresize? If so, can we also work out a similar
patch for pvmove?

Regards,
Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-09  2:42         ` Eric Ren
@ 2018-01-09 15:42           ` David Teigland
  2018-01-10  6:55             ` Eric Ren
  0 siblings, 1 reply; 11+ messages in thread
From: David Teigland @ 2018-01-09 15:42 UTC (permalink / raw)
  To: Eric Ren; +Cc: LVM general discussion and development

On Tue, Jan 09, 2018 at 10:42:27AM +0800, Eric Ren wrote:
> > I've tested your patch and it works very well.  Thanks very much.
> 
> Could you please consider pushing this patch upstream?

OK

> Also, is this the same case for pvmove as for lvresize? If so, can we also
> work out a similar patch for pvmove?

Running pvmove on an LV active on multiple hosts could be allowed with the
same kind of patch.  However, it would need to use cmirror which we are
trying to phase out; the recent cluster raid1 has a more promising future.
So I think cmirror should be left in the clvm era and not brought forward.

Dave

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-09 15:42           ` David Teigland
@ 2018-01-10  6:55             ` Eric Ren
  2018-01-10 15:56               ` David Teigland
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Ren @ 2018-01-10  6:55 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland

Hi David,

On 01/09/2018 11:42 PM, David Teigland wrote:
> On Tue, Jan 09, 2018 at 10:42:27AM +0800, Eric Ren wrote:
>>> I've tested your patch and it works very well.  Thanks very much.
>> Could you please consider pushing this patch upstream?
> OK

Thanks very much! So, can we update `man 8 lvmlockd` to remove the
following limitation on lvresize?

"""
limitations of lockd VGs
...
   · resizing an LV that is active in the shared mode on multiple hosts
"""

>
>> Also, is this the same case for pvmove as for lvresize? If so, can we also
>> work out a similar patch for pvmove?
> Running pvmove on an LV active on multiple hosts could be allowed with the
> same kind of patch.  However, it would need to use cmirror which we are

OK, I see.

> trying to phase out; the recent cluster raid1 has a more promising future.

My understanding is:

If cluster raid1 is used as the PV, data is replicated and data migration is
nearly equivalent to replacing a disk. However, in the scenario where the PV
is on a raw disk, pvmove is very handy for data migration.

IIRC, you mean we could consider using cluster raid1 as the underlying DM
target to support pvmove in a cluster, since the current pvmove uses the
mirror target?

> So I think cmirror should be left in the clvm era and not brought forward.

By the way, another thing I'd like to ask about: do we really want to drop
the concept of clvm?

From my understanding, lvmlockd is going to replace only the "clvmd"
daemon, not clvm exactly. clvm is apparently short for cluster/cluster-aware
LVM, which is intuitive naming. I see clvm as an abstract concept that
consists of two pieces: clvmd and cmirrord. IMHO, I'd like to see the clvm
concept remain, no matter what we do with clvmd and cmirrord. It might make
the change easier for users and documentation to digest :)

Regards,
Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-10  6:55             ` Eric Ren
@ 2018-01-10 15:56               ` David Teigland
  2018-01-11  9:32                 ` Eric Ren
  0 siblings, 1 reply; 11+ messages in thread
From: David Teigland @ 2018-01-10 15:56 UTC (permalink / raw)
  To: Eric Ren; +Cc: LVM general discussion and development

On Wed, Jan 10, 2018 at 02:55:42PM +0800, Eric Ren wrote:
> If cluster raid1 is used as the PV, data is replicated and data migration is
> nearly equivalent to replacing a disk. However, in the scenario where the PV
> is on a raw disk, pvmove is very handy for data migration.
> 
> IIRC, you mean we could consider using cluster raid1 as the underlying DM
> target to support pvmove in a cluster, since the current pvmove uses the
> mirror target?

That's what I imagined could be done, but I've not thought about it in
detail.  IMO pvmove under a shared LV is too complicated and not worth
doing.

> By the way, another thing I'd like to ask about: do we really want to drop
> the concept of clvm?
> 
> From my understanding, lvmlockd is going to replace only the "clvmd" daemon,
> not clvm exactly.  clvm is apparently short for cluster/cluster-aware LVM,
> which is intuitive naming. I see clvm as an abstract concept that consists
> of two pieces: clvmd and cmirrord. IMHO, I'd like to see the clvm concept
> remain, no matter what we do with clvmd and cmirrord. It might make the
> change easier for users and documentation to digest :)

Thank you for pointing out the artifice in naming here, it has long
irritated me too.  There is indeed no such thing as "clvm" or "HA LVM",
and I think we'd be better off to ban these terms completely, at least at
the technical level.  (Historically, I suspect sales/marketing had a role
in this mess by wanting to attach a name to something to sell.)

If the term "clvm" survives, it will become even worse IMO if we expand it
to cover cases not using "clvmd".  To me it's all just "lvm", and I don't
see why we need any other names.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes
  2018-01-10 15:56               ` David Teigland
@ 2018-01-11  9:32                 ` Eric Ren
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Ren @ 2018-01-11  9:32 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland

Hi David,

>> IIRC, you mean we could consider using cluster raid1 as the underlying DM
>> target to support pvmove in a cluster, since the current pvmove uses the
>> mirror target?
> That's what I imagined could be done, but I've not thought about it in
> detail.  IMO pvmove under a shared LV is too complicated and not worth
> doing.

Very true.

>
>> By the way, another thing I'd like to ask about: do we really want to drop
>> the concept of clvm?
>>
>> From my understanding, lvmlockd is going to replace only the "clvmd" daemon,
>> not clvm exactly.  clvm is apparently short for cluster/cluster-aware LVM,
>> which is intuitive naming. I see clvm as an abstract concept that consists
>> of two pieces: clvmd and cmirrord. IMHO, I'd like to see the clvm concept
>> remain, no matter what we do with clvmd and cmirrord. It might make the
>> change easier for users and documentation to digest :)
> Thank you for pointing out the artifice in naming here, it has long
> irritated me too.  There is indeed no such thing as "clvm" or "HA LVM",
> and I think we'd be better off to ban these terms completely, at least at
> the technical level.  (Historically, I suspect sales/marketing had a role
> in this mess by wanting to attach a name to something to sell.)
Haha, like cluster MD raid.
>
> If the term "clvm" survives, it will become even worse IMO if we expand it
> to cover cases not using "clvmd".  To me it's all just "lvm", and I don't
> see why we need any other names.
It looks like people need a simple name to distinguish the usage scenarios:
local and cluster.

Thanks,
Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-01-11  9:32 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-28 10:42 [linux-lvm] lvmlockd: about the limitation on lvresizing the LV active on multiple nodes Eric Ren
2018-01-02  8:09 ` Eric Ren
2018-01-02 17:10 ` David Teigland
2018-01-03  3:52   ` Eric Ren
2018-01-03 15:07     ` David Teigland
2018-01-04  9:06       ` Eric Ren
2018-01-09  2:42         ` Eric Ren
2018-01-09 15:42           ` David Teigland
2018-01-10  6:55             ` Eric Ren
2018-01-10 15:56               ` David Teigland
2018-01-11  9:32                 ` Eric Ren
