From: Andreas Gruenbacher <agruenba@redhat.com>
To: David Laight <David.Laight@aculab.com>
Cc: cluster-devel <cluster-devel@redhat.com>, Jan Kara <jack@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"ocfs2-devel@oss.oracle.com" <ocfs2-devel@oss.oracle.com>
Subject: Re: [Ocfs2-devel] [PATCH v4 1/8] iov_iter: Introduce iov_iter_fault_in_writeable helper
Date: Tue, 27 Jul 2021 13:13:47 +0200
Message-ID: <CAHc6FU4N7vz+jfoUSa45Mr_F0Ht0_PXroWoc5UNkMgFmpKLaNw@mail.gmail.com>
In-Reply-To: <03e0541400e946cf87bc285198b82491@AcuMS.aculab.com>

On Tue, Jul 27, 2021 at 11:30 AM David Laight <David.Laight@aculab.com> wrote:
> From: Linus Torvalds
> > Sent: 24 July 2021 20:53
> >
> > On Sat, Jul 24, 2021 at 12:35 PM Andreas Gruenbacher
> > <agruenba@redhat.com> wrote:
> > >
> > > +int iov_iter_fault_in_writeable(const struct iov_iter *i, size_t bytes)
> > > +{
> > ...
> > > +                       if (fault_in_user_pages(start, len, true) != len)
> > > +                               return -EFAULT;
> >
> > Looking at this once more, I think this is likely wrong.
> >
> > Why?
> >
> > Because any user can/should only care about at least *part* of the
> > area being writable.
> >
> > Imagine that you're doing a large read. If the *first* page is
> > writable, you should still return the partial read, not -EFAULT.
>
> My 2c...
>
> Is it actually worth doing any more than ensuring the first byte
> of the buffer is paged in before entering the block that has
> to disable page faults?

We definitely do want to process as many pages as we can, especially
if allocations are involved during a write.
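
To illustrate the "as many pages as we can" point, here is a minimal
userspace sketch, not the code from the patch. It assumes a hypothetical
helper, sketch_fault_in(), that simulates page presence with a bool array
and reports how many leading bytes are usable; the caller fails with
-EFAULT only when no bytes at all can be processed. All names and the
4096-byte page size are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/*
 * Hypothetical stand-in for a fault-in helper: present[i] says whether
 * simulated page i can be made writable. Returns the number of bytes,
 * starting at 'start', that are usable before the first bad page.
 */
static size_t sketch_fault_in(const bool *present, size_t npages,
                              size_t start, size_t len)
{
    size_t done = 0;

    assert(present != NULL);
    while (done < len) {
        size_t pos = start + done;
        size_t page = pos / SKETCH_PAGE_SIZE;
        size_t in_page = SKETCH_PAGE_SIZE - pos % SKETCH_PAGE_SIZE;

        if (page >= npages || !present[page])
            break;              /* stop at the first inaccessible page */
        done += (len - done < in_page) ? len - done : in_page;
    }
    return done;
}

/* Returns the bytes that can be processed, or -EFAULT only if none can. */
static long sketch_prepare(const bool *present, size_t npages,
                           size_t start, size_t len)
{
    size_t ok = sketch_fault_in(present, npages, start, len);

    if (len != 0 && ok == 0)
        return -EFAULT;
    return (long)ok;
}
```

With pages 0 and 1 present and page 2 missing, a request that starts at
offset 100 is reported as partially usable up to the end of page 1 rather
than being rejected outright.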

> Most of the time all the pages are present so the IO completes.

That's not guaranteed. There are cases in which none of the pages are
present, and then there are cases in which only the first page is
present (for example, because of a previous access that wasn't page
aligned).

> The pages can always get unmapped (due to page pressure or
> another application thread unmapping them) so there needs
> to be a retry loop.
> Given the cost of actually faulting in a page going around
> the outer loop may not matter.
> Indeed, if an application has just mmap()ed in a very large
> file and is then doing a write() from it then it is quite
> likely that the pages got unmapped!
>
> Clearly there needs to be extra code to ensure progress is made.
> This might actually require the use of 'bounce buffers'
> for really problematic user requests.

I'm not sure if repeated unmapping of the pages that we've just
faulted in is going to be a problem (in terms of preventing progress).
But a suitable heuristic might be to shrink the fault-in "window" on
each retry until it's only one page.
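
A rough userspace sketch of that heuristic, under stated assumptions: the
copy step is modelled as a hypothetical callback that returns how many
bytes it managed to copy with page faults disabled, and the window is
halved after every short copy until it reaches one page. The names, the
halving policy, and the 4096-byte page size are illustrative only and not
taken from the patch set.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/*
 * Hypothetical callback: try to copy up to 'window' bytes at 'offset' with
 * page faults disabled; return how many bytes were actually copied.
 */
typedef size_t (*sketch_copy_fn)(void *ctx, size_t offset, size_t window);

/*
 * Copy 'total' bytes, halving the fault-in window after every short copy,
 * down to one page. Gives up when a one-page window makes no progress.
 * Returns the number of bytes copied.
 */
static size_t sketch_copy_shrinking(sketch_copy_fn copy, void *ctx,
                                    size_t total, size_t window)
{
    size_t done = 0;

    assert(copy != NULL);
    if (window < SKETCH_PAGE_SIZE)
        window = SKETCH_PAGE_SIZE;

    while (done < total) {
        size_t want = total - done < window ? total - done : window;
        /* A real implementation would fault in 'want' bytes here. */
        size_t got = copy(ctx, done, want);

        assert(got <= want);
        done += got;
        if (got == want)
            continue;           /* no retry needed for this chunk */
        if (got == 0 && window == SKETCH_PAGE_SIZE)
            break;              /* smallest window failed: give up */
        window /= 2;
        if (window < SKETCH_PAGE_SIZE)
            window = SKETCH_PAGE_SIZE;
    }
    return done;
}

/* Simulated copy for demonstration: only windows up to 'max_ok' succeed. */
struct sketch_sim {
    size_t max_ok;
    unsigned calls;
};

static size_t sketch_sim_copy(void *ctx, size_t offset, size_t window)
{
    struct sketch_sim *sim = ctx;

    (void)offset;
    sim->calls++;
    return window <= sim->max_ok ? window : 0;
}
```

Stopping once a single-page window makes no progress keeps the loop
bounded; whether a real caller should fall back to something else at that
point (such as the bounce buffers mentioned above) is left open here.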

> I also wonder what actually happens for pipes and fifos.
> IIRC reads and write of up to PIPE_MAX (typically 4096)
> are expected to be atomic.
> This should be true even if there are page faults part way
> through the copy_to/from_user().
>
> It has to be said I can't see any reference to PIPE_MAX
> in the linux man pages, but I'm sure it is in the POSIX/TOG
> spec.
>
>         David

Thanks,
Andreas


