linux-lvm.redhat.com archive mirror
From: Damon Wang <damon.devops@gmail.com>
To: David Teigland <teigland@redhat.com>
Cc: LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] [lvmlockd] recovery lvmlockd after kill_vg
Date: Fri, 28 Sep 2018 11:14:35 +0800
Message-ID: <CABZYMH4eYC-r+wCPyC8r-xqzNbRr-OukRoweNAD2nej6wgDKyA@mail.gmail.com>
In-Reply-To: <20180927173550.GB2706@redhat.com>

On Fri, Sep 28, 2018 at 1:35 AM David Teigland <teigland@redhat.com> wrote:
>
> On Thu, Sep 27, 2018 at 10:12:44PM +0800, Damon Wang wrote:
> > Thank you for your reply. I have another question under such
> > circumstances.
> >
> > I usually run "vgck" to check whether the vg is good, but sometimes it
> > seems to get stuck and leaves a VGLK held in sanlock. (I'm sure an io
> > error will cause this, but sometimes it happens without an io error.)
> > Then I try "sanlock client release -r xxx" to release it, but that
> > also sometimes does not work (it gets stuck).
> > Then I may run "lvmlockctl -r" to drop the vg lockspace, but it may
> > still get stuck, and I'm sure io is ok when it is stuck.
> >
> > This usually happens on multipath storage. I suspect multipath queueing
> > io is to blame, but I'm not sure.
> >
> > Any idea?
>
> First, you might be able to avoid this issue by doing the check using
> something other than an lvm command, or perhaps an lvm command configured
> to avoid taking locks (the --nolocking option in vgs/pvs/lvs).  What's
> appropriate depends on specifically what you want to know from the check.
>

This is how I use sanlock and lvmlockd:


 +------------------+            +---------------------+         +----------------+
 |                  |            |                     |         |                |
 |     sanlock      <------------>     lvmlockd        <---------+  lvm commands  |
 |                  |            |                     |         |                |
 +------------------+            +---------------------+         +----------------+
       |
       |
       |
       |      +------------------+                               +-----------------+        +------------+
       |      |                  |                               |                 |        |            |
       +------>     multipath    <- - -  -  -  -   -  -  -  -  - |  lvm volumes    <--------+    qemu    |
              |                  |                               |                 |        |            |
              +------------------+                               +-----------------+        +------------+
                      |
                      |
                      |
                      |
                      |
              +------------------+
              |                  |
              |   san storage    |
              |                  |
              |                  |
              +------------------+

As I mentioned in my first mail, lvm commands sometimes fail with
"sanlock lease storage failure". I guess this is because lvmlockd's kill_vg
has been triggered. As the manual says, I should deactivate the volumes and
drop the lockspace as quickly as possible, but I can't get a proper alert
programmatically.

A message does show up on the TTY, but that is not a good thing to listen
to or monitor, so I run vgck periodically and parse its stdout and stderr.
Once "sanlock lease storage failure" or something else unusual appears, an
alert is triggered and I do some checks (I hope this whole process can be
automated).
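
A minimal sketch of the periodic vgck check described above, in Python.
The function names, the result labels and the 10 second timeout are
illustrative assumptions, not anything provided by lvm; the only error
string matched is the one quoted in this thread.

```python
import subprocess

# Substrings that should raise an alert. Only the first entry comes from
# this thread; extend the tuple for other messages you care about.
ALERT_PATTERNS = ("sanlock lease storage failure",)


def classify_vgck(returncode, stdout, stderr):
    """Classify one vgck run as 'lease_failure', 'error' or 'ok'."""
    text = (stdout + "\n" + stderr).lower()
    if any(pattern in text for pattern in ALERT_PATTERNS):
        return "lease_failure"
    if returncode != 0:
        return "error"
    return "ok"


def check_vg(vg_name, timeout_s=10.0):
    """Run vgck with a timeout so a stuck command is reported as 'timeout'.

    Caveat: on timeout subprocess.run() kills the child and then waits for
    it, so a child blocked in uninterruptible io can still delay the return.
    """
    try:
        proc = subprocess.run(
            ["vgck", vg_name],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    except FileNotFoundError:
        return "vgck_missing"
    return classify_vgck(proc.returncode, proc.stdout, proc.stderr)
```

Keeping the classification separate from the subprocess call means the
matching logic can be exercised without a real VG, and a "timeout" result
gives the monitor a distinct signal for the stuck case.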

If the check does not take a lock (pvs/lvs/vgs --nolocking), these errors
won't be noticed. Since many SAN storage setups configure multipath to queue
io for as long as possible (multipath -t | grep queue_if_no_path), getting
an lvm error early is pretty difficult. After trying various approaches,
running vgck and parsing its output is the way with less load (it takes a
shared VGLK) and better efficiency (it usually takes less than 0.1s).
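
The "multipath -t | grep queue_if_no_path" check can be sketched in Python
as a small parser over the dumped configuration text. This is only a
sketch: find_queueing_settings is a hypothetical helper, and it assumes the
dump uses the usual multipath.conf "keyword value" line layout. It flags
both a features string containing queue_if_no_path and no_path_retry set
to queue.

```python
def find_queueing_settings(config_text):
    """Return the config lines that make multipath queue io when no path is left."""
    hits = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split "keyword value"; section lines such as "device {" fall through.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip().strip('"') if len(parts) > 1 else ""
        if "queue_if_no_path" in value:
            hits.append(line)
        elif key == "no_path_retry" and value == "queue":
            hits.append(line)
    return hits
```

Feeding it the captured stdout of "multipath -t" gives a list that is
empty when nothing is configured to queue indefinitely.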

As you suggested, I'll extend the io timeout to ride out storage jitter, and
I believe that will also resolve some of the problems caused by multipath
queueing io.


> I still haven't fixed the issue you found earlier, which sounds like it
> could be the same or related to what you're describing now.
> https://www.redhat.com/archives/linux-lvm/2018-July/msg00011.html
>
> As for manually cleaning up a stray lock using sanlock client, there may
> be some limits on the situations that works in, I don't recall off hand.
> You should try using the -p <pid> option with client release to match the
> pid of lvmlockd.


Yes, I added -p when releasing the lock. I want to write up an "Emergency
Procedures" summary for dealing with different storage failures; for me it
is still unclear now.
I'll do more experiments after fixing these annoying storage failures, then
write this summary.
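
For reference, a small Python sketch that only builds the release command
with -p, without running it. build_release_cmd is a hypothetical helper;
the resource string is passed through exactly as you would give it to -r,
and the pid is assumed to be that of the running lvmlockd, however you
look it up. Whether the release succeeds for a given stray lock is a
separate question, as noted above.

```python
def build_release_cmd(resource, lvmlockd_pid):
    """Return the argv list for 'sanlock client release -r RESOURCE -p PID'."""
    if not resource:
        raise ValueError("resource string must not be empty")
    if not isinstance(lvmlockd_pid, int) or lvmlockd_pid <= 0:
        raise ValueError("lvmlockd_pid must be a positive integer")
    return ["sanlock", "client", "release", "-r", resource, "-p", str(lvmlockd_pid)]
```

The returned list can be logged for review first, or handed to
subprocess.run() with a timeout, since the release itself may get stuck.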


> Configuring multipath to fail more quickly instead of queueing might give
> you a better chance of cleaning things up.
>
> Dave
>

Yeah, I believe multipath queueing io is to blame. I'm negotiating with the
storage vendor, since they think the multipath config is right :-(


Thank you for your patience!

Damon
