ocfs2-devel.oss.oracle.com archive mirror
 help / color / mirror / Atom feed
From: "Andreas Grünbacher" <andreas.gruenbacher@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: cluster-devel <cluster-devel@redhat.com>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] [PATCH v3 5/7] iomap: Support restarting direct I/O requests after user copy failures
Date: Mon, 26 Jul 2021 19:45:22 +0200	[thread overview]
Message-ID: <CAHpGcMLtQ1=WOT1mTUS4=iWBwHLQ-EBzY=+XuSGJfu4gVPYTLw@mail.gmail.com> (raw)
In-Reply-To: <20210726171940.GM20621@quack2.suse.cz>


[-- Attachment #1.1: Type: text/plain, Size: 3714 bytes --]

Jan Kara <jack@suse.cz> schrieb am Mo., 26. Juli 2021, 19:21:

> On Fri 23-07-21 22:58:38, Andreas Gruenbacher wrote:
> > In __iomap_dio_rw, when iomap_apply returns an -EFAULT error, complete
> the
> > request synchronously and reset the iterator to the start position.  This
> > allows callers to deal with the failure and retry the operation.
> >
> > In gfs2, we need to disable page faults while we're holding glocks to
> prevent
> > deadlocks.  This patch is the minimum solution I could find to make
> > iomap_dio_rw work with page faults disabled.  It's still expensive
> because any
> > I/O that was carried out before hitting -EFAULT needs to be retried.
> >
> > A possible improvement would be to add an IOMAP_DIO_FAULT_RETRY or
> similar flag
> > that would allow iomap_dio_rw to return a short result when hitting
> -EFAULT.
> > Callers could then retry only the rest of the request after dealing with
> the
> > page fault.
> >
> > Asynchronous requests turn into synchronous requests up to the point of
> the
> > page fault in any case, but they could be retried asynchronously after
> dealing
> > with the page fault.  To make that work, the completion notification
> would have
> > to include the bytes read or written before the page fault(s) as well,
> and we'd
> > need an additional iomap_dio_rw argument for that.
> >
> > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > ---
> >  fs/iomap/direct-io.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> > index cc0b4bc8861b..b0a494211bb4 100644
> > --- a/fs/iomap/direct-io.c
> > +++ b/fs/iomap/direct-io.c
> > @@ -561,6 +561,15 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter
> *iter,
> >               ret = iomap_apply(inode, pos, count, iomap_flags, ops, dio,
> >                               iomap_dio_actor);
> >               if (ret <= 0) {
> > +                     if (ret == -EFAULT) {
> > +                             /*
> > +                              * To allow retrying the request, fail
> > +                              * synchronously and reset the iterator.
> > +                              */
> > +                             wait_for_completion = true;
> > +                             iov_iter_revert(dio->submit.iter,
> dio->size);
> > +                     }
> > +
>
> Hum, OK, but this means that if userspace submits large enough write, GFS2
> will livelock trying to complete it? While other filesystems can just
> submit multiple smaller bios constructed in iomap_apply() (paging in
> different parts of the buffer) and thus complete the write?
>

No. First, this affects reads but not writes. We cannot just blindly repeat
writes; when a page fault occurs in the middle of a write, the result will
be a short write. For reads, the plan is to ads a flag to allow
iomap_dio_rw to return a partial result when a page fault occurs.
(Currently, it fails the entire request.) Then we can handle the page fault
and complete the rest of the request.

The changes needed for that are simple on the iomap side, but we need to go
through some gymnastics for handling the page fault without giving up the
glock in the non-contended case. There will still be the potential for
losing the lock and having to re-acquire it, in which case we'll actually
have to repeat the entire read.

Thanks,
Andreas

                                                                Honza
>
> >                       /* magic error code to fall back to buffered I/O */
> >                       if (ret == -ENOTBLK) {
> >                               wait_for_completion = true;
> > --
> > 2.26.3
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
>

[-- Attachment #1.2: Type: text/html, Size: 5035 bytes --]

[-- Attachment #2: Type: text/plain, Size: 151 bytes --]

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

  reply	other threads:[~2021-07-26 17:45 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-23 20:58 [Ocfs2-devel] [PATCH v3 0/7] gfs2: Fix mmap + page fault deadlocks Andreas Gruenbacher
2021-07-23 20:58 ` [Ocfs2-devel] [PATCH v3 1/7] iov_iter: Introduce fault_in_iov_iter helper Andreas Gruenbacher
2021-07-23 23:40   ` Linus Torvalds
2021-07-24  7:51     ` Andreas Grünbacher
2021-07-24  1:52   ` Al Viro
2021-07-24  8:05     ` Andreas Grünbacher
2021-07-26 16:33   ` Jan Kara
2021-07-26 17:15     ` Linus Torvalds
2021-07-23 20:58 ` [Ocfs2-devel] [PATCH v3 2/7] gfs2: Add wrapper for iomap_file_buffered_write Andreas Gruenbacher
2021-07-23 20:58 ` [Ocfs2-devel] [PATCH v3 3/7] gfs2: Fix mmap + page fault deadlocks for buffered I/O Andreas Gruenbacher
2021-07-23 20:58 ` [Ocfs2-devel] [PATCH v3 4/7] iomap: Fix iomap_dio_rw return value for user copies Andreas Gruenbacher
2021-07-23 20:58 ` [Ocfs2-devel] [PATCH v3 5/7] iomap: Support restarting direct I/O requests after user copy failures Andreas Gruenbacher
2021-07-26 17:19   ` Jan Kara
2021-07-26 17:45     ` Andreas Grünbacher [this message]
2021-07-26 18:08       ` Jan Kara
2021-07-23 20:58 ` [Ocfs2-devel] [PATCH v3 6/7] iov_iter: Introduce noio flag to disable page faults Andreas Gruenbacher
2021-07-23 20:58 ` [Ocfs2-devel] [PATCH v3 7/7] gfs2: Fix mmap + page fault deadlocks for direct I/O Andreas Gruenbacher
2021-07-26 17:02   ` Jan Kara
2021-07-26 17:50     ` Andreas Grünbacher
2021-07-26 18:00       ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHpGcMLtQ1=WOT1mTUS4=iWBwHLQ-EBzY=+XuSGJfu4gVPYTLw@mail.gmail.com' \
    --to=andreas.gruenbacher@gmail.com \
    --cc=agruenba@redhat.com \
    --cc=cluster-devel@redhat.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --subject='Re: [Ocfs2-devel] [PATCH v3 5/7] iomap: Support restarting direct I/O requests after user copy failures' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).