All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock
Date: Mon, 8 Oct 2018 09:25:40 -0400 (EDT)	[thread overview]
Message-ID: <1213809939.19185231.1539005140496.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <35e18368-cb90-421b-3998-949d00535000@redhat.com>

----- Original Message -----
> 
> 
> On 08/10/18 13:59, Mark Syms wrote:
> > That sounds entirely reasonable so long as you are absolutely sure that
> > nothing is ever going to mess with that glock, we erred on the side of
> > more caution not knowing whether it would be guaranteed safe or not.
> >
> > Thanks,
> >
> > 	Mark
> We should have a look at the history to see how that wait got added.
> However the "dead" flag here means "don't touch this glock" and is there
> so that we can separate the marking dead from the actual removal from
> the list (which simplifies the locking during the scanning procedures)
> 
> Steve.
(snip)
> > That is a bit odd. In fact that whole function looks odd. I wonder why it
> > needs to wait in the first place. It should be a simple comparison here.
> > If the glock is dead then nothing else should touch it, so we are safe to
> > add a new one into the hash table. The wait is almost certainly a bug,
> >
> > Steve.

Hi,

Andreas and I both did a ton of work here trying to get this right, and it
was all done for the problems we had with transitioning dinodes from
unlinked to free, and how the glocks beneath them were competing by way
of the in-core inodes. The glocks typically outlive the inodes, but we
had tons of problems with inodes coming and going, their associated glocks
coming and going, and being marked dead, and the rcus underneath them.

The problem is that inodes are coming and going, some with I_FREE, or
I_WILL_FREE. Glocks are also coming and going, often for the same block
at the same time, but two different inodes for two different files, and
both inode and iopen glocks.

The problems stem from remote unlinks causing "delete work" while the
inodes and glocks are both being allocated and freed in rapid succession.

We encountered lots of hangs and use-after-free problems. My  point is not
that we shouldn't fix it, but merely that we need to be VERY careful here
not to reintroduce one of the countless problems we fixed. Mark's original
patch seems pretty low risk to me, but if we go that route, I'd like to
see a smaller timeout; 1HZ seems like a very long time. Better still to
see if there's a better fix that doesn't break anything.

Bob Peterson



  parent reply	other threads:[~2018-10-08 13:25 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-08 12:36 [Cluster-devel] [PATCH 0/2] GFS2 locking patches Mark Syms
2018-10-08 12:36 ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Mark Syms
2018-10-08 12:56   ` Steven Whitehouse
2018-10-08 12:59     ` Mark Syms
2018-10-08 13:03       ` Steven Whitehouse
2018-10-08 13:10         ` Tim Smith
2018-10-08 13:13           ` Steven Whitehouse
2018-10-08 13:26             ` Tim Smith
2018-10-09  8:13               ` Mark Syms
2018-10-09  8:41                 ` Steven Whitehouse
2018-10-09  8:50                   ` Mark Syms
2018-10-09 12:34               ` Andreas Gruenbacher
2018-10-09 14:00                 ` Tim Smith
2018-10-09 14:47                   ` Andreas Gruenbacher
2018-10-09 15:34                     ` Tim Smith
2018-10-08 13:25         ` Bob Peterson [this message]
2018-10-08 12:36 ` [Cluster-devel] [PATCH 2/2] GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads Mark Syms
2018-10-08 12:57   ` Steven Whitehouse
2018-10-08 20:10   ` Bob Peterson
2018-10-08 20:53     ` Mark Syms
     [not found] ` <1603192.QCZcUHDx0E@dhcp-3-135.uk.xensource.com>
     [not found]   ` <CAHc6FU7AyDQ=yEj9WtN+aqtf7jfWs0TGh=Og0fQFVHyYMKHacA@mail.gmail.com>
2018-10-09 15:08     ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Tim Smith
2018-10-10  8:22       ` Tim Smith
2018-10-11  8:14         ` Mark Syms
2018-10-11 19:28           ` Mark Syms
     [not found]           ` <5bbfa461.1c69fb81.1242f.59a7SMTPIN_ADDED_BROKEN@mx.google.com>
2019-03-06 16:14             ` Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1213809939.19185231.1539005140496.JavaMail.zimbra@redhat.com \
    --to=rpeterso@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.