From: Steven Whitehouse <swhiteho@redhat.com>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: cluster-devel <cluster-devel@redhat.com>, Jan Kara <jack@suse.cz>,
LKML <linux-kernel@vger.kernel.org>,
Christoph Hellwig <hch@infradead.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] [Cluster-devel] [PATCH v6 10/19] gfs2: Introduce flag for glock holder auto-demotion
Date: Fri, 20 Aug 2021 14:47:54 +0100
Message-ID: <d5fbfeff64cee4a2045e4e53abbd205618888044.camel@redhat.com>
In-Reply-To: <CAHc6FU7jz9z9FEu3gY0S2A2Rv6cQJzp7p_5NOnU3b8Zpz+QsVg@mail.gmail.com>
Hi,
On Fri, 2021-08-20 at 15:17 +0200, Andreas Gruenbacher wrote:
> On Fri, Aug 20, 2021 at 11:35 AM Steven Whitehouse <swhiteho@redhat.com> wrote:
> > On Thu, 2021-08-19 at 21:40 +0200, Andreas Gruenbacher wrote:
> > > From: Bob Peterson <rpeterso@redhat.com>
> > >
> > > This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure
> > > that will allow glocks to be demoted automatically on locking
> > > conflicts. When a locking request comes in that isn't compatible
> > > with the locking state of a holder and that holder has the
> > > HIF_MAY_DEMOTE flag set, the holder will be demoted automatically
> > > before the incoming locking request is granted.
> >
> > I'm not sure I understand what is going on here. When there are
> > locking conflicts we generate call backs and those result in glock
> > demotion. There is no need for a flag to indicate that I think, since
> > it is the default behaviour anyway. Or perhaps the explanation is just
> > a bit confusing...
>
> When a glock has active holders (with the HIF_HOLDER flag set), the
> glock won't be demoted to a state incompatible with any of those
> holders.
>
Ok, that is a much clearer explanation of what the patch does. Active
holders have always prevented demotions previously.
> > > Processes that allow a glock holder to be taken away indicate this
> > > by calling gfs2_holder_allow_demote(). When they need the glock
> > > again, they call gfs2_holder_disallow_demote() and then they check
> > > if the holder is still queued: if it is, they're still holding the
> > > glock; if it isn't, they need to re-acquire the glock.
> > >
> > > This allows processes to hang on to locks that could become part of
> > > a cyclic locking dependency. The locks will be given up when a
> > > (rare) conflicting locking request occurs, and don't need to be
> > > given up prematurely.
> >
> > This seems backwards to me. We already have the glock layer cache the
> > locks until they are required by another node. We also have the min
> > hold time to make sure that we don't bounce locks too much. So what is
> > the problem that you are trying to solve here I wonder?
>
> This solves the problem of faulting in pages during read and write
> operations: on the one hand, we want to hold the inode glock across
> those operations. On the other hand, those operations may fault in
> pages, which may require taking the same or other inode glocks,
> directly or indirectly, which can deadlock.
>
> So before we fault in pages, we indicate with
> gfs2_holder_allow_demote(gh) that we can cope if the glock is taken
> away from us. After faulting in the pages, we indicate with
> gfs2_holder_disallow_demote(gh) that we now actually need the glock
> again. At that point, we either still have the glock (i.e., the holder
> is still queued and it has the HIF_HOLDER flag set), or we don't.
>
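The allow/disallow sequence described above can be sketched as a small
single-threaded userspace model. This is only an illustration of the state
transitions, not the gfs2 code: every toy_* name and TOY_* flag below is
hypothetical, and the real implementation would do this under the glock's
locking rather than on a bare struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not the gfs2 API. */
enum {
	TOY_HOLDER     = 1 << 0,	/* holder has actually been granted the lock */
	TOY_MAY_DEMOTE = 1 << 1,	/* holder tolerates losing the lock */
};

struct toy_holder {
	unsigned int flags;
	bool queued;		/* still on the lock's list of holders */
};

static void toy_acquire(struct toy_holder *gh)
{
	gh->flags = TOY_HOLDER;
	gh->queued = true;
}

/* Called before faulting in pages: the lock may now be taken away. */
static void toy_allow_demote(struct toy_holder *gh)
{
	gh->flags |= TOY_MAY_DEMOTE;
}

/*
 * Called after faulting in pages.  Returns true if the holder is still
 * queued and granted, false if the caller has lost the lock.
 */
static bool toy_disallow_demote(struct toy_holder *gh)
{
	gh->flags &= ~(unsigned int)TOY_MAY_DEMOTE;
	return gh->queued && (gh->flags & TOY_HOLDER);
}

/*
 * A conflicting request arrives.  Returns true if it can be granted,
 * which requires either no active holder or one that may be demoted.
 */
static bool toy_conflicting_request(struct toy_holder *gh)
{
	if (!(gh->flags & TOY_HOLDER))
		return true;
	if (!(gh->flags & TOY_MAY_DEMOTE))
		return false;	/* active holders block the demotion */
	gh->flags &= ~(unsigned int)(TOY_HOLDER | TOY_MAY_DEMOTE);
	gh->queued = false;	/* holder is dequeued on auto-demotion */
	return true;
}
```

In this model a conflicting request fails against a plain active holder and
succeeds only between toy_allow_demote() and toy_disallow_demote(), after
which toy_disallow_demote() reports that the lock was lost.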
> The different kinds of read and write operations differ in how they
> handle the latter case:
>
> * When a buffered read or write loses the inode glock, it returns a
>   short result. This prevents torn writes and reading things that have
>   never existed on disk in that form.
>
> * When a direct read or write loses the inode glock, it re-acquires it
>   before resuming the operation. Direct I/O is not expected to return
>   partial results and doesn't provide any kind of synchronization among
>   processes.
>
> We could solve this kind of problem in other ways, for example, by
> keeping a glock generation number, dropping the glock before faulting
> in pages, re-acquiring it afterwards, and checking if the generation
> number has changed. This would still be an additional piece of glock
> infrastructure, but more heavyweight than the HIF_MAY_DEMOTE flag
> which uses the existing glock holder infrastructure.
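For comparison, the generation-number alternative mentioned above might look
roughly like the following userspace sketch. It assumes a counter that is
bumped whenever someone else gets the lock while it is dropped; the toy_*
names and the fault_in callback are hypothetical and do not correspond to
anything in gfs2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not the gfs2 API. */
struct toy_glock {
	unsigned long generation;	/* bumped when another party takes the lock */
	bool held;
};

static void toy_lock(struct toy_glock *gl)
{
	gl->held = true;
}

static void toy_unlock(struct toy_glock *gl)
{
	gl->held = false;
}

/* Models another node taking and releasing the lock while it is free. */
static void toy_remote_take(struct toy_glock *gl)
{
	if (!gl->held)
		gl->generation++;
}

/*
 * Drop the lock, fault in pages (modelled by the optional callback),
 * re-acquire the lock, and report whether the generation is unchanged.
 * A false return means the protected state may have changed meanwhile.
 */
static bool toy_fault_in_and_recheck(struct toy_glock *gl,
				     void (*fault_in)(struct toy_glock *))
{
	unsigned long seen = gl->generation;

	toy_unlock(gl);
	if (fault_in)
		fault_in(gl);
	toy_lock(gl);
	return gl->generation == seen;
}
```

Unlike the holder-flag approach, this sketch always gives the lock up before
faulting in pages, even when nobody else wants it.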
This is working towards the "why" but could probably be summarised a
bit more. We always used to manage to avoid holding fs locks when
copying to/from userspace to avoid these complications. If that is no
longer possible then it would be good to document what the new
expectations are somewhere suitable in Documentation/filesystems/...
just so we make sure it is clear what the new system is, and everyone
will be on the same page.

Steve.