From: Andreas Gruenbacher <agruenba@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH v2 6/6] gfs2: introduce and use new glops go_lock_needed
Date: Wed, 22 Sep 2021 15:54:49 +0200
Message-ID: <CAHc6FU7KBVKQYev8fAuCt5p1ENczHqDdKV96xCKc_p1aowgk+Q@mail.gmail.com>
In-Reply-To: <7c83e1ac-5ec6-b008-51d0-11d978ec642f@redhat.com>

On Wed, Sep 22, 2021 at 2:47 PM Bob Peterson <rpeterso@redhat.com> wrote:
> On 9/22/21 6:57 AM, Andreas Gruenbacher wrote:
> > On Thu, Sep 16, 2021 at 9:11 PM Bob Peterson <rpeterso@redhat.com> wrote:
> >> Before this patch, when a glock was locked, the very first holder on the
> >> queue would unlock the lockref and call the go_lock glops function (if
> >> one exists), unless GL_SKIP was specified. When we introduced the new
> >> node-scope concept, we allowed multiple holders to lock glocks in EX mode
> >> and share the lock, but node-scope introduced a new problem: if the
> >> first holder has GL_SKIP and the next one does NOT, since it is not the
> >> first holder on the queue, the go_lock op was not called.
> >
> > We use go_lock to (re)validate inodes (for inode glocks) and to read
> > in bitmaps (for resource group glocks). I can see how calling go_lock
> > was originally tied to the first lock holder, but GL_SKIP already
> > broke the simple model that the first holder will call go_lock. The
> > go_lock_needed callback only makes things worse yet again,
> > unfortunately.
>
> In what way does go_lock_needed make things worse?

It adds an indirection that papers over the fact that the existing
abstraction (first holder calls go_lock) doesn't make sense.

> > How about we introduce a new GLF_REVALIDATE flag that indicates that
> > go_lock needs to be called? The flag would be set when instantiating a
> > new glock and when dequeuing the last holder, and cleared in go_lock
> > (and in gfs2_inode_refresh for GL_SKIP holders). I'm not sure if
>
> That was my original design, and it makes the most sense. I named the
> flag GLF_GO_LOCK_SKIPPED, but essentially the same thing. Unfortunately,
> I ran into all kinds of problems implementing it. In those patches,
> first holders would either call glops->go_lock() or set
> GLF_GO_LOCK_SKIPPED. Once the go_lock function was complete, it cleared
> GLF_GO_LOCK_SKIPPED, and called wake_up_bit. Secondary holders did
> wait_on_bit and waited for the other process's go_lock to complete.

Just set the flag when we know the glock needs revalidation. There are
two possible points in time for doing that: either when we're locking
the first holder, or when the glock is new / the last holder is
dequeued. Then, we can handle clearing the flag and races among
multiple go_lock instances in the go_lock handlers.
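
For illustration only, the flag lifecycle described above could be modeled
in user space roughly as follows. This is a minimal sketch, not gfs2 code:
apart from the flag name GLF_REVALIDATE, every identifier (struct
glock_model, glock_init, glock_go_lock, glock_enqueue, glock_dequeue) is
hypothetical, the "skip" argument merely stands in for a GL_SKIP holder, and
the model is single-threaded, so it does not show how a second holder would
wait for a revalidation that is still in progress.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; these are NOT the gfs2 data structures. */
enum { GLF_REVALIDATE = 1u << 0 };

struct glock_model {
	atomic_uint flags;          /* holds GLF_REVALIDATE */
	unsigned int holders;       /* number of current holders */
	unsigned int revalidations; /* how often the "go_lock" work ran */
};

/* A new glock starts out needing revalidation. */
static void glock_init(struct glock_model *gl)
{
	atomic_init(&gl->flags, GLF_REVALIDATE);
	gl->holders = 0;
	gl->revalidations = 0;
}

/*
 * Stand-in for a go_lock handler: the flag is cleared atomically, and only
 * the caller that actually cleared it performs the revalidation work.
 * Returns true if this call did the work.
 */
static bool glock_go_lock(struct glock_model *gl)
{
	unsigned int old = atomic_fetch_and(&gl->flags,
					    ~(unsigned int)GLF_REVALIDATE);

	if (!(old & GLF_REVALIDATE))
		return false;
	gl->revalidations++;
	return true;
}

/* Any holder that does not skip revalidates, first on the queue or not. */
static void glock_enqueue(struct glock_model *gl, bool skip)
{
	gl->holders++;
	if (!skip)
		glock_go_lock(gl);
}

/* Dequeuing the last holder marks the glock as needing revalidation. */
static void glock_dequeue(struct glock_model *gl)
{
	if (--gl->holders == 0)
		atomic_fetch_or(&gl->flags, GLF_REVALIDATE);
}
```

In this model, a skipping first holder followed by a non-skipping second
holder still triggers exactly one revalidation, because the decision depends
on the flag rather than on the holder's position in the queue.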

> But I had tons of problems getting this to work properly. Processes
> would hang and deadlock for seemingly no reason. Finally I got
> frustrated and sought other solutions.
>
> I'm willing to try to resurrect that patch set and try again. Maybe you
> can help me figure out what I'm doing wrong and why it's not working.
>
> Bob Peterson
>
> > GLF_REVALIDATE can fully replace GIF_INVALID as well, but it looks
> > like it at first glance.
> >
> > Thanks,
> > Andreas
>

Thanks,
Andreas

Thread overview: 10+ messages
2021-09-16 19:09 [Cluster-devel] [GFS2 PATCH v2 0/6] gfs2: fix bugs related to node_scope and go_lock Bob Peterson
2021-09-16 19:09 ` [Cluster-devel] [GFS2 PATCH v2 1/6] gfs2: remove redundant check in gfs2_rgrp_go_lock Bob Peterson
2021-09-16 19:09 ` [Cluster-devel] [GFS2 PATCH v2 2/6] gfs2: Add GL_SKIP holder flag to dump_holder Bob Peterson
2021-09-16 19:10 ` [Cluster-devel] [GFS2 PATCH v2 3/6] gfs2: move GL_SKIP check from glops to do_promote Bob Peterson
2021-09-16 19:10 ` [Cluster-devel] [GFS2 PATCH v2 4/6] gfs2: Switch some BUG_ON to GLOCK_BUG_ON for debug Bob Peterson
2021-09-16 19:10 ` [Cluster-devel] [GFS2 PATCH v2 5/6] gfs2: simplify do_promote and fix promote trace Bob Peterson
2021-09-16 19:10 ` [Cluster-devel] [GFS2 PATCH v2 6/6] gfs2: introduce and use new glops go_lock_needed Bob Peterson
2021-09-22 11:57   ` Andreas Gruenbacher
2021-09-22 12:47     ` Bob Peterson
2021-09-22 13:54       ` Andreas Gruenbacher [this message]