* [linux-lvm] has anyone used LVM in a HA cluster?
@ 2002-05-09 11:53 Au, Richard
  2002-05-09 12:14 ` Tim
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Au, Richard @ 2002-05-09 11:53 UTC (permalink / raw)
  To: linux-lvm

Hi,

I'm wondering if anyone has used LVM in a high-availability cluster
where two servers are connected to shared storage (the physical
volumes).  If so, which cluster solution did you use?  Will there be
problems if the logical volumes are visible to both servers, even if
only one of them has them mounted?  Thanks!

Richard Au
rau@archway.com


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-09 11:53 [linux-lvm] has anyone used LVM in a HA cluster? Au, Richard
@ 2002-05-09 12:14 ` Tim
  2002-05-09 14:38 ` Chad Walstrom
  2002-05-10  2:24 ` Patrick Caulfield
  2 siblings, 0 replies; 15+ messages in thread
From: Tim @ 2002-05-09 12:14 UTC (permalink / raw)
  To: linux-lvm

Quoth Au, Richard:
> Hi,
> 
> I'm wondering if anyone has used LVM in a high-availability cluster
> where two servers are connected to shared storage (the physical
> volumes).  If so, which cluster solution did you use?  Will there be
> problems if the logical volumes are visible to both servers, even if
> only one of them has them mounted?  Thanks!

1) heartbeat (linux-ha; article found a while ago on samag.com)
2) not yet :-) but we are also doing offsite replication of the site,
   so it might be said that we can afford to take more risks than some.


-- 
    "The most valuable piece of equipment in the darkroom
     is the trash can."
                                  --Ansel Adams


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-09 11:53 [linux-lvm] has anyone used LVM in a HA cluster? Au, Richard
  2002-05-09 12:14 ` Tim
@ 2002-05-09 14:38 ` Chad Walstrom
  2002-05-09 14:46   ` Goetz Bock
  2002-07-13  7:12   ` Wichert Akkerman
  2002-05-10  2:24 ` Patrick Caulfield
  2 siblings, 2 replies; 15+ messages in thread
From: Chad Walstrom @ 2002-05-09 14:38 UTC (permalink / raw)
  To: linux-lvm


On Thu, May 09, 2002 at 09:54:40AM -0700, Au, Richard wrote:
> I'm wondering if anyone has used LVM in a high-availability cluster
> where two servers are connected to shared storage (the physical
> volumes).  If so, which cluster solution did you use?  Will there be
> problems if the logical volumes are visible to both servers, even if
> only one of them has them mounted?  Thanks!

Doing something like this w/o having servers participate in some sort of
locking or journaling scheme is somewhat scary.  You'd have to be pretty
careful on how you access a shared physical storage system in this
manner.

I believe Sistina's GFS project might be more to your liking. IIRC, it's
a clustered, journaling filesystem designed explicitly for your setup.
It's a bonus that it is also GPL.  Check it out at:
http://www.sistina.com/products_gfs.htm.  The latest GPL version of GFS
seems to be 4.1.1 (as present on ftp://ftp.sistina.com/pub/GFS).  I'm
not sure, but I'm assuming from the info available on the website that
GFS 5.0.1 is commercially available as well.  Is Sistina doing something
similar to TrollTech w/licensing GFS, then?  Or are they just behind the
ball as far as releasing the newer versions to the ftp site?  (I'm
assuming the former.)

-- 
Chad Walstrom <chewie@wookimus.net>                 | a.k.a. ^chewie
http://www.wookimus.net/                            | s.k.a. gunnarr


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-09 14:38 ` Chad Walstrom
@ 2002-05-09 14:46   ` Goetz Bock
  2002-07-13  7:12   ` Wichert Akkerman
  1 sibling, 0 replies; 15+ messages in thread
From: Goetz Bock @ 2002-05-09 14:46 UTC (permalink / raw)
  To: linux-lvm; +Cc: Chad Walstrom

On Thu, May 09 '02 at 14:35, Chad Walstrom wrote:
> I believe Sistina's GFS project might be more to your liking. IIRC, it's
> a clustered, journaling filesystem designed explicitly for your setup.
> It's a bonus that it is also GPL.  Check it out at:
It was GPL, but it no longer is. There is an old GPL version (4.1.1), but the
latest one is commercial only.
OTOH there is OpenGFS (www.openGFS.org), which continues the GPLed
version.

Then again, if you have the money, using the commercial version might be a
nice way to thank Sistina for what they did (you'd get support, too).

Cu,
    Goetz.


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-09 11:53 [linux-lvm] has anyone used LVM in a HA cluster? Au, Richard
  2002-05-09 12:14 ` Tim
  2002-05-09 14:38 ` Chad Walstrom
@ 2002-05-10  2:24 ` Patrick Caulfield
  2002-05-10 17:51   ` Austin Gonyou
  2 siblings, 1 reply; 15+ messages in thread
From: Patrick Caulfield @ 2002-05-10  2:24 UTC (permalink / raw)
  To: linux-lvm

On Thu, May 09, 2002 at 09:54:40AM -0700, Au, Richard wrote:
> Hi,
> 
> I'm wondering if anyone has used LVM in a high-availability cluster
> where two servers are connected to shared storage (the physical
> volumes).  If so, which cluster solution did you use?  Will there be
> problems if the logical volumes are visible to both servers, even if
> only one of them has them mounted?  Thanks!

Provided you're either using GFS as the file system or being VERY careful to
mount the filesystem on only one node at a time you can do this.

The key is just to be VERY careful. If you need to do any LVM commands you MUST

umount filesystems on all other nodes
vgchange -an on all other nodes

do the LVM metadata changes

vgscan on all nodes
vgchange -ay on all nodes.

The safe thing to do is to have only one node have the LVM commands available to
it (apart from vgscan & vgchange) and be VERY careful.

I'll say that again: Be VERY careful !
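
The ordering above can be written down as a dry-run plan. This is only a sketch in Python: it builds the list of commands per node and runs nothing. The node names, the volume group name and the example lvextend call are hypothetical; how the commands reach each node (ssh, console, ...) is left open.

```python
def metadata_change_plan(admin_node, other_nodes, vg, mountpoints, change_cmd):
    """Return (node, command) pairs in the order described above."""
    plan = []
    # 1. Quiesce every node except the one that makes the change.
    for node in other_nodes:
        for mp in mountpoints:
            plan.append((node, f"umount {mp}"))
        plan.append((node, f"vgchange -an {vg}"))
    # 2. Do the LVM metadata change on exactly one node.
    plan.append((admin_node, change_cmd))
    # 3. Every node rescans, then every node reactivates the VG.
    all_nodes = [admin_node, *other_nodes]
    for node in all_nodes:
        plan.append((node, "vgscan"))
    for node in all_nodes:
        plan.append((node, f"vgchange -ay {vg}"))
    return plan
```

For example, metadata_change_plan("node1", ["node2"], "vg0", ["/data"], "lvextend -L +1G /dev/vg0/lv0") yields the umount and vgchange -an steps for node2 first, then the lvextend on node1, then vgscan and vgchange -ay on both. Printing the plan and reviewing it before typing anything fits the "be VERY careful" advice.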

patrick


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-10  2:24 ` Patrick Caulfield
@ 2002-05-10 17:51   ` Austin Gonyou
  2002-05-10 19:45     ` Steven Lembark
  0 siblings, 1 reply; 15+ messages in thread
From: Austin Gonyou @ 2002-05-10 17:51 UTC (permalink / raw)
  To: linux-lvm


From what I understand you *can* mount the LVM volumes on multiple hosts
at the same time, but they should be mounted *read-only* on the hosts which
do not *own* the data on the disks. You could then *remount* the volume
read-write on the target failover host.
The only drawback is that you will need monitoring, in the end
anyway, that will allow you to see:

1. Which hosts have which volumes mounted
2. and in which mode (ro/rw)

Also, you would need to use a filesystem that supports:

1. multiple hosts mounting it
2. filesystem 'remount' option

Talk to the guys on the Linux FailSafe list, they might be able to
point you in the right direction as well.
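
For the monitoring part (which volume is mounted, and in which mode), a minimal Python sketch could read /proc/mounts on each host. The function names are made up for illustration, and the device path you have to look for depends on how your LVM version names its device nodes, so treat that as an assumption.

```python
def mount_mode(device, mounts_text):
    """Return 'ro', 'rw', or None if the device is not mounted.

    mounts_text is the content of /proc/mounts: one mount per line,
    fields are device, mount point, fs type, options, dump, pass.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == device:
            options = fields[3].split(",")
            return "ro" if "ro" in options else "rw"
    return None


def local_mount_mode(device, path="/proc/mounts"):
    """Check the mount mode of a device on the local host."""
    with open(path) as f:
        return mount_mode(device, f.read())
```

Collecting local_mount_mode() from every host and alerting when more than one host reports "rw" for the same volume would cover both points of the first list.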

On Fri, 2002-05-10 at 02:26, Patrick Caulfield wrote:
> On Thu, May 09, 2002 at 09:54:40AM -0700, Au, Richard wrote:
> > Hi,
> > 
> > I'm wondering if anyone has used LVM in a high-availability cluster
> > where two servers are connected to shared storage (the physical
> > volumes).  If so, which cluster solution did you use?  Will there be
> > problems if the logical volumes are visible to both servers, even if
> > only one of them has them mounted?  Thanks!
> 
> Provided you're either using GFS as the file system or being VERY careful to
> mount the filesystem on only one node at a time you can do this.
> 
> The key is just to be VERY careful. If you need to do any LVM commands you MUST
> 
> umount filesystems on all other nodes
> vgchange -an on all other nodes
> 
> do the LVM metadata changes
> 
> vgscan on all nodes
> vgchange -ay on all nodes.
> 
> The safe thing to do is to have only one node have the LVM commands available to
> it (apart from vgscan & vgchange) and be VERY careful.
> 
> I'll say that again: Be VERY careful !
> 
> patrick
> 
> 
-- 
Austin Gonyou
Systems Architect, CCNA
Coremetrics, Inc.
Phone: 512-698-7250
email: austin@coremetrics.com

"One ought never to turn one's back on a threatened danger and 
try to run away from it. If you do that, you will double the danger. 
But if you meet it promptly and without flinching, you will 
reduce the danger by half."
Sir Winston Churchill


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-10 17:51   ` Austin Gonyou
@ 2002-05-10 19:45     ` Steven Lembark
  2002-05-11 14:26       ` Luca Berra
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Lembark @ 2002-05-10 19:45 UTC (permalink / raw)
  To: linux-lvm


-- Austin Gonyou <austin@coremetrics.com>

> From what I understand you *can* mount the LVM volumes on multiple hosts
> at the same time, but you should be *readonly* on the hosts which do not
> *own* the data on the disks. You could then *remount* the volume on the
> target failover host.

I've had reasonable luck with this approach on other
platforms (HP-UX, but the LVM is similar enough that
it should work).

All it really needs is a "heartbeatd": you open a
socket to, say, echo (port 7) and every second write
a byte. If you don't get the byte back in, say, 100ms
then you try one more time before re-mounting the LV
rw on the backup host.

This can all be done w/ sockets in perl with about
20 lines of code. If both hosts monitor each other
then either monitor can scream for help.
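
Here is roughly what such a monitor could look like, sketched in Python rather than perl. It assumes the peer runs a TCP echo service (port 7 by default, per RFC 862); the function names and the on_failure callback are hypothetical. Unlike the description above, it opens a fresh connection for each probe instead of keeping one socket open.

```python
import socket
import time


def echo_ok(host, port, timeout):
    """Send one byte to an echo service; True if it comes back in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"x")
            return s.recv(1) == b"x"
    except OSError:
        # Refused, unreachable, or timed out: count it as a missed beat.
        return False


def peer_alive(host, port=7, timeout=0.1, attempts=2):
    """Probe once, and retry once more before declaring the peer dead."""
    return any(echo_ok(host, port, timeout) for _ in range(attempts))


def monitor(host, on_failure, interval=1.0, probe=peer_alive):
    """Probe every `interval` seconds; call on_failure when the peer is lost."""
    while probe(host):
        time.sleep(interval)
    on_failure()
```

The on_failure callback is where the backup host would remount the LV read-write (for example with mount -o remount,rw). Note that a missed echo only shows that the peer did not answer in time, not that it is actually down.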


--
Steven Lembark                               2930 W. Palmer
Workhorse Computing                       Chicago, IL 60647
                                            +1 800 762 1582


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-10 19:45     ` Steven Lembark
@ 2002-05-11 14:26       ` Luca Berra
  2002-05-13  4:15         ` Michael Lausch
  0 siblings, 1 reply; 15+ messages in thread
From: Luca Berra @ 2002-05-11 14:26 UTC (permalink / raw)
  To: linux-lvm

On Fri, May 10, 2002 at 07:44:44PM -0500, Steven Lembark wrote:
> 
> 
> -- Austin Gonyou <austin@coremetrics.com>
> 
> > From what I understand you *can* mount the LVM volumes on multiple hosts
> > at the same time, but you should be *readonly* on the hosts which do not
> > *own* the data on the disks. You could then *remount* the volume on the
> > target failover host.
> 
> I've had reasonable luck with this approach on other
> platforms (HP-UX, but the LVM is similar enough that
> it should work).

Well, actually a lot of details about the filesystem are cached on
a host, so you get really funny results when you don't expect
filesystem metadata to change under your nose. I would not be
surprised to see a box panic under these circumstances.

> All it really needs is a "heartbeatd": you open a
> socket to, say, echo (port 7) and every second write
> a byte. If you don't get the byte back in, say, 100ms
> then you try one more time before re-mounting the LV
> rw on the backup host.

There is a tool called heartbeat at http://www.linux-ha.org/;
have a look at it, it really rocks.
It already has a resource that supports mounting partitions
on shared media in case a host fails; it would be trivial
to adapt it for LVM.

L.


-- 
Luca Berra -- bluca@comedia.it
        Communication Media & Services S.r.l.
 /"\
 \ /     ASCII RIBBON CAMPAIGN
  X        AGAINST HTML MAIL
 / \


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-11 14:26       ` Luca Berra
@ 2002-05-13  4:15         ` Michael Lausch
  2002-05-13  6:54           ` Luca Berra
  2002-05-13  7:33           ` Tim
  0 siblings, 2 replies; 15+ messages in thread
From: Michael Lausch @ 2002-05-13  4:15 UTC (permalink / raw)
  To: linux-lvm

Luca Berra wrote:
> On Fri, May 10, 2002 at 07:44:44PM -0500, Steven Lembark wrote:
> 
> 
> there is a tool called heartbeat on http://www.linux-ha.org/
> have a look at it, it really rocks.
> it already has a resource that supports mounting of partitions
> on shared media in case a host fails, it would be trivial
> to adapt it for LVM

And make sure the daemons are running in the real-time scheduler class.
There exists a state, commonly referred to as "split brain", where each of
the nodes of a cluster "thinks" the other one is down when in fact it is not.
One reason for this may be that the load is so high that the heartbeat
daemon is not scheduled in time to answer the requests (it happened to
me with a commercial product). Then both nodes mount the filesystem.
Usually the initial fsck (or log replay or whatever) is enough to destroy
the filesystem beyond repair. But none of these things are in any way LVM
specific, so it works.
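
As an illustration of the scheduler class point, a heartbeat daemon written in Python could ask for SCHED_FIFO at startup. This is a sketch that assumes Linux and enough privilege (root or CAP_SYS_NICE); the function name and the default priority of 10 are arbitrary choices, not something taken from heartbeat itself.

```python
import os


def enter_realtime(priority=10):
    """Try to move the calling process into the SCHED_FIFO class.

    Returns True on success, False if the platform lacks the call or
    the process is not allowed to change its scheduling policy.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # not a platform with POSIX scheduler control
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    prio = max(lowest, min(highest, priority))
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(prio))
    except OSError:
        return False
    return True
```

A daemon that gets False back should at least log it, since it is then exposed to the load-induced false alarms described above.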

Plan the export and import scenarios very carefully (which commands must be
executed by the node that has the disk mounted, and which commands (vgscan
and friends) must be executed on the node that takes over the filesystem),
and take all possible failover scenarios into account, e.g. the active node
crashes hard and you have to scan the volume groups, do the fsck combo on
dirty filesystems and mount the volumes.
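
The hard-crash case could be sketched like this in Python. The volume group, LV and mount point names are placeholders, /dev/<vg>/<lv> is assumed as the device path, and the sketch assumes you already know for certain that the other node is down. The runner is injectable so the sequence can be rehearsed without touching real disks.

```python
import subprocess


def take_over(vg, lv, mountpoint, run=subprocess.call):
    """Run the takeover steps in order; stop at the first failure.

    Returns None if everything succeeded, else the argv of the failed step.
    """
    device = f"/dev/{vg}/{lv}"
    steps = [
        (["vgscan"], {0}),
        (["vgchange", "-ay", vg], {0}),
        # fsck exit code 1 means "errors were found and corrected"
        (["fsck", "-a", device], {0, 1}),
        (["mount", device, mountpoint], {0}),
    ]
    for argv, ok_codes in steps:
        if run(argv) not in ok_codes:
            return argv
    return None
```

Passing a fake run function that only records its arguments is a cheap way to review the exact command sequence before a real failover ever needs it.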

My experience stems from Veritas HA, Veritas Cluster (which uses a
kernel module for heartbeats to work around the scheduler problems) and
various versions of Veritas Volume Manager. The volume manager is very
similar to an MD/LVM combination.


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-13  4:15         ` Michael Lausch
@ 2002-05-13  6:54           ` Luca Berra
  2002-05-13  7:33           ` Tim
  1 sibling, 0 replies; 15+ messages in thread
From: Luca Berra @ 2002-05-13  6:54 UTC (permalink / raw)
  To: linux-lvm

On Mon, May 13, 2002 at 11:16:50AM +0200, Michael Lausch wrote:
> And make sure the daemons are running in the real time scheduler class.
> There exists a state, commonly referred to as "split brain", where the
> nodes of a cluster "think" the other one is down, which is not the fact.
You are very right!
I believe the best solution would be having a lock manager
that uses the shared storage to provide a lock.
Another option is to have an external node which functions
as an arbitrator (i.e. the external quorum found in HP MC/SG
for Linux).
The last option, which as I reckon is the only one that can be implemented
with heartbeat atm, is having one node effectively kill the other one
by turning off the power. But it would make me very nervous.
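
To show why an external arbitrator helps, here is a small majority-vote sketch in Python. It is a generic illustration with three votes (this node, its peer, the arbitrator) and made-up function names; it does not describe how HP MC/SG actually implements its quorum.

```python
def has_quorum(votes_seen, total_votes):
    """True if this side of a partition holds a strict majority."""
    return 2 * votes_seen > total_votes


def may_take_over(peer_reachable, arbitrator_reachable):
    """Decide whether this node may take over the shared storage."""
    # Three votes in total: this node, its peer, and the arbitrator.
    votes = 1 + int(peer_reachable) + int(arbitrator_reachable)
    return (not peer_reachable) and has_quorum(votes, total_votes=3)
```

With only two nodes, each side of a split sees one vote out of two and neither has a majority. The third vote breaks the tie: only the node that can still reach the arbitrator is allowed to proceed, provided both nodes ask the same arbitrator.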


L.

-- 
Luca Berra -- bluca@comedia.it
        Communication Media & Services S.r.l.
 /"\
 \ /     ASCII RIBBON CAMPAIGN
  X        AGAINST HTML MAIL
 / \


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-13  4:15         ` Michael Lausch
  2002-05-13  6:54           ` Luca Berra
@ 2002-05-13  7:33           ` Tim
  2002-05-13 10:59             ` Michael Lausch
  1 sibling, 1 reply; 15+ messages in thread
From: Tim @ 2002-05-13  7:33 UTC (permalink / raw)
  To: linux-lvm

> And make sure the daemons are running in the real time scheduler class.
> There exists a state, commonly referred to as "split brain", where the
> nodes of a cluster "think" the other one is down, which is not the fact.
> Reason for this may be that the load is so high, that the heartbeat
> daemon is not scheduled in time to answer the requests (it happened to
> me with a commercial product). then both nodes mount the filesystem.
> Usually the initial fsck (or log replay or whatever) is enough to destroy
> the filesystem beyond repair. But all these things are in no way LVM
> specific, so it works.

Or don't, and buy a power supply that you can control from serial, and
do so -- STONITH, it's called -- Shoot The Other Node In The Head.  Once
the power is off, there is no danger of fsck'ing...

It's a rather elegant way to solve that problem, IMHO.

We're going that route.
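
The important part is the ordering: fence first, and touch the shared storage only after the power-off is confirmed. A minimal Python sketch, where both callables are hypothetical stand-ins for whatever drives your power switch and your resource takeover:

```python
def failover(power_off_peer, take_over_resources):
    """Fence the peer, then take over; never the other way round.

    power_off_peer() must return True only when the peer is confirmed off.
    Returns True if the takeover was performed.
    """
    if not power_off_peer():
        # Fencing unconfirmed: the peer may still be writing, so stay out.
        return False
    take_over_resources()
    return True
```

If the power switch cannot confirm the power-off, the function returns False and leaves the storage alone.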

-- 
    "The most valuable piece of equipment in the darkroom
     is the trash can."
                                  --Ansel Adams


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-13  7:33           ` Tim
@ 2002-05-13 10:59             ` Michael Lausch
  2002-05-13 11:23               ` Steven Lembark
  2002-05-13 11:28               ` Tim
  0 siblings, 2 replies; 15+ messages in thread
From: Michael Lausch @ 2002-05-13 10:59 UTC (permalink / raw)
  To: linux-lvm

Tim wrote:

> Or don't, and buy a power supply that you can control from serial, and
> do so -- STONITH, it's called -- Shoot The Other Node In The Head.  Once
> the power is off, there is no danger of fsck'ing...
> 
> It's a rather elegant way to solve that problem, IMHO.
> 
> We're going that route.
> 

which may backfire if both nodes think the other one is down (split
brain again) and start the shutdown procedure. Okay, this is a very rare
situation, and may happen only under strange load and scheduling
parameters, but it will happen, as any other "very rare situation"
happens ;-). Especially in HA environments they seem to happen much more
often than in simple single-point-of-failure environments ;-) You won't
lose your filesystem, but the service is unavailable.


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-13 10:59             ` Michael Lausch
@ 2002-05-13 11:23               ` Steven Lembark
  2002-05-13 11:28               ` Tim
  1 sibling, 0 replies; 15+ messages in thread
From: Steven Lembark @ 2002-05-13 11:23 UTC (permalink / raw)
  To: linux-lvm

> which may backfire, if both nodes think the other one is down (split
> brain again) and start the shutdown procedure. okay, this is a very rare
> situation, and may happen only under strange load and scheduling
> parameters, but it will happen as any other "very rare situation" happens
> ;-). especially in HA environments they seem to happen much more often
> than in simple single point of failure environments ;-) you won't lose
> your filesystem, but the service is unavailable.

Nothing will give you 100%: eventually the switchover methods
introduce more marginal P(fail) than the original setup had.
The joy of reliability studies is figuring out how to take the first
partial derivative of P(fail) w/ respect to the switchover systems and
set it to zero... Turns out the most reliable answer is a SWAG
anyway :-)
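
A toy model makes the trade-off concrete. Everything in this Python sketch is an assumption made for illustration: failures are treated as independent, and the three probabilities (node failure, switchover failing when needed, switchover misfiring on its own) are invented parameters, not measurements.

```python
def pair_outage_probability(p_node, p_switch_fail, p_spurious):
    """Rough outage probability of a two-node failover pair.

    p_node:        chance that a single node fails in the period
    p_switch_fail: chance that the switchover fails when it is needed
    p_spurious:    chance that the switchover machinery itself causes
                   an outage (e.g. a false split-brain reaction)
    """
    # Primary fails AND (backup fails OR the switchover does not work)
    p_failover_path = p_node * (1 - (1 - p_node) * (1 - p_switch_fail))
    # Outage if either the failover path or a spurious reaction bites
    return 1 - (1 - p_spurious) * (1 - p_failover_path)
```

With p_node = 0.01 and p_switch_fail = 0.05, the pair is far better than a single node as long as p_spurious is zero. Setting p_spurious to 0.02 already makes the pair worse than the single node it was meant to protect.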

--
Steven Lembark                               2930 W. Palmer
Workhorse Computing                       Chicago, IL 60647
                                            +1 800 762 1582


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-13 10:59             ` Michael Lausch
  2002-05-13 11:23               ` Steven Lembark
@ 2002-05-13 11:28               ` Tim
  1 sibling, 0 replies; 15+ messages in thread
From: Tim @ 2002-05-13 11:28 UTC (permalink / raw)
  To: linux-lvm

> > We're going that route.
> > 
> 
> which may backfire, if both nodes think the other one is down (split
> brain again) and start the shutdown procedure. okay, this is a very rare
> situation, and may happen only under strange load and scheduling
> parameters, but it will happen as any other "very rare situation"
> happens ;-). especially in HA environments they seem to happen much more
> often than in simple single point of failure environments ;-) you won't
> lose your filesystem, but the service is unavailable.

Probably possible, unlikely in practice, however.  In any event, we have
constructed a load-sharing off-site node that receives redirected
traffic via GSLB if the main node dies, so we have some time to wander
over to the datacenter and kick one of the HA nodes.  Writes can't
happen for a few minutes, but we're not recording financial transactions
directly, and email notices of any pending purchases just queue up in
the meantime.  This solution was not really what I had initially
planned, but it was a stipulation of a large contract we bid on (and
won), so we just went ahead and built it.  Better to hobble along for a
while during a problem, than to just fall over dead.  Meanwhile, the
issue of incremental backups (eg. to recover from user errors) is, at
least in my environment, fairly well handled by using CVS for code and
nightly rsync's for so-called 'static' files (a misnomer, but 'static'
in the sense that we only keep one revision kicking around :-)).

I've seen an awful lot of high-availability systems in production, and
the one I liked most (because it never seemed to cause problems) was the
IBM HACMP cluster(s) at IBM Microelectronics.  Until that becomes a
reasonable possibility for Linux, I guess we'll just stick to multiple
levels of redundancy and failover, the good old 'two safety nets are
better than one' theory...

The HACMP guys were the ones who suggested STONITH for Linux; while your
split-brain scenario could lead to both nodes losing power, this is not
as big of an issue (with journaled filesystems) as would be corruption.


-- 
    "The most valuable piece of equipment in the darkroom
     is the trash can."
                                  --Ansel Adams


* Re: [linux-lvm] has anyone used LVM in a HA cluster?
  2002-05-09 14:38 ` Chad Walstrom
  2002-05-09 14:46   ` Goetz Bock
@ 2002-07-13  7:12   ` Wichert Akkerman
  1 sibling, 0 replies; 15+ messages in thread
From: Wichert Akkerman @ 2002-07-13  7:12 UTC (permalink / raw)
  To: linux-lvm; +Cc: Chad Walstrom

Previously Chad Walstrom wrote:
> I believe Sistina's GFS project might be more to your liking. IIRC, it's
> a clustered, journaling filesystem designed explicitly for your setup.
> It's a bonus that it is also GPL.

Newer releases are not GPL, look at the OpenGFS project instead.

Wichert.

-- 
  _________________________________________________________________
 /wichert@wiggy.net         This space intentionally left occupied \
| wichert@deephackmode.org            http://www.liacs.nl/~wichert/ |
| 1024D/2FA3BC2D 576E 100B 518D 2F16 36B0  2805 3CB8 9250 2FA3 BC2D |
