linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <djwong@kernel.org>, Jan Kara <jack@suse.cz>,
	Matthew Wilcox <willy@infradead.org>,
	cluster-devel <cluster-devel@redhat.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"ocfs2-devel@oss.oracle.com" <ocfs2-devel@oss.oracle.com>,
	Josef Bacik <josef@toxicpanda.com>, Will Deacon <will@kernel.org>
Subject: Re: [RFC][arm64] possible infinite loop in btrfs search_ioctl()
Date: Thu, 21 Oct 2021 11:05:50 +0100	[thread overview]
Message-ID: <YXE7fhDkqJbfDk6e@arm.com> (raw)
In-Reply-To: <CAHc6FU7bpjAxP+4dfE-C0pzzQJN1p=C2j3vyXwUwf7fF9JF72w@mail.gmail.com>

On Thu, Oct 21, 2021 at 02:46:10AM +0200, Andreas Gruenbacher wrote:
> On Tue, Oct 12, 2021 at 1:59 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Mon, Oct 11, 2021 at 2:08 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > >
> > > +#ifdef CONFIG_ARM64_MTE
> > > +#define FAULT_GRANULE_SIZE     (16)
> > > +#define FAULT_GRANULE_MASK     (~(FAULT_GRANULE_SIZE-1))
> >
> > [...]
> >
> > > If this looks in the right direction, I'll do some proper patches
> > > tomorrow.
> >
> > Looks fine to me. It's going to be quite expensive and bad for caches, though.
> >
> > That said, fault_in_writable() is _supposed_ to all be for the slow
> > path when things go south and the normal path didn't work out, so I
> > think it's fine.
> 
> Let me get back to this; I'm actually not convinced that we need to
> worry about sub-page-size fault granules in fault_in_pages_readable or
> fault_in_pages_writeable.
> 
> From a filesystem point of view, we can get into trouble when a
> user-space read or write triggers a page fault while we're holding
> filesystem locks, and that page fault ends up calling back into the
> filesystem. To deal with that, we're performing those user-space
> accesses with page faults disabled.

Yes, this makes sense.

> When a page fault would occur, we
> get back an error instead, and then we try to fault in the offending
> pages. If a page is resident and we still get a fault trying to access
> it, trying to fault in the same page again isn't going to help and we
> have a true error.

You can't be sure the second fault is a true error. The unlocked
fault_in_*() may race with some LRU scheme making the pte not accessible
or a write-back making it clean/read-only. copy_to_user() with
pagefault_disabled() fails again but that's a benign fault. The
filesystem should re-attempt the fault-in (gup would correct the pte),
disable page faults and copy_to_user(), potentially in an infinite loop.
If you bail out on the second/third uaccess following a fault_in_*()
call, you may get some unexpected errors (though very rare). Maybe the
filesystems avoid this problem somehow but I couldn't figure it out.

> We're clearly looking at memory at a page
> granularity; faults at a sub-page level don't matter at this level of
> abstraction (but they do show similar error behavior). To avoid
> getting stuck, when it gets a short result or -EFAULT, the filesystem
> implements the following backoff strategy: first, it tries to fault in
> a number of pages. When the read or write still doesn't make progress,
> it scales back and faults in a single page. Finally, when that still
> doesn't help, it gives up. This strategy is needed for actual page
> faults, but it also handles sub-page faults appropriately as long as
> the user-space access functions give sensible results.

As I said above, I think with this approach there's a small chance of
incorrectly reporting an error when the fault is recoverable. If you
change it to an infinite loop, you'd run into the sub-page fault
problem.

There are some places with such infinite loops: futex_wake_op(),
search_ioctl() in the btrfs code. I still have to get my head around
generic_perform_write() but I think we get away here because it faults
in the page with a get_user() rather than gup (and copy_from_user() is
guaranteed to make progress if any bytes can still be accessed).

-- 
Catalin

  reply	other threads:[~2021-10-21 10:05 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-27 16:49 [PATCH v7 00/19] gfs2: Fix mmap + page fault deadlocks Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 01/19] iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value Andreas Gruenbacher
2021-09-09 11:09   ` Christoph Hellwig
2021-08-27 16:49 ` [PATCH v7 02/19] powerpc/kvm: Fix kvm_use_magic_page Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 03/19] gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable} Andreas Gruenbacher
2021-08-27 19:08   ` Al Viro
2021-09-03 14:56   ` Filipe Manana
2021-09-28 15:02     ` Andreas Gruenbacher
2021-09-28 16:37       ` Matthew Wilcox
2021-09-28 20:41         ` Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 04/19] iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable Andreas Gruenbacher
2021-08-27 18:53   ` Al Viro
2021-08-27 18:57     ` Linus Torvalds
2021-08-27 19:16       ` Al Viro
2021-08-27 20:56   ` Kari Argillander
2021-08-28 17:13     ` Linus Torvalds
2021-08-27 16:49 ` [PATCH v7 05/19] iov_iter: Introduce fault_in_iov_iter_writeable Andreas Gruenbacher
2021-08-27 18:49   ` Al Viro
2021-08-27 19:05     ` Linus Torvalds
2021-08-27 19:23       ` Al Viro
2021-08-27 19:33         ` Linus Torvalds
2021-08-27 19:37           ` Al Viro
2021-08-27 21:48             ` Al Viro
2021-08-27 21:57               ` Al Viro
2021-08-27 23:22                 ` Luck, Tony
2021-08-28  2:20                   ` Luck, Tony
2021-08-28 21:47                   ` Thomas Gleixner
2021-08-28 22:04                     ` Al Viro
2021-08-28 22:11                       ` Al Viro
2021-08-28 22:19                         ` Al Viro
2021-08-28 22:51                           ` Al Viro
2021-08-29 18:44                             ` Thomas Gleixner
2021-08-29 19:46                               ` Al Viro
2021-08-29 19:51                                 ` Thomas Gleixner
2021-08-28 22:20                         ` Tony Luck
2021-08-29  1:40                           ` Matthew Wilcox
2021-08-30 15:41                             ` Luck, Tony
2021-08-28 22:23                       ` Thomas Gleixner
2021-08-28 19:28               ` [RFC][arm64] possible infinite loop in btrfs search_ioctl() Al Viro
2021-08-31 13:54                 ` Catalin Marinas
2021-08-31 15:28                   ` Al Viro
2021-08-31 16:01                     ` Catalin Marinas
2021-10-11 17:37                     ` Catalin Marinas
2021-10-11 19:15                       ` Linus Torvalds
2021-10-11 21:08                         ` Catalin Marinas
2021-10-11 23:59                           ` Linus Torvalds
2021-10-12 17:27                             ` Catalin Marinas
2021-10-12 17:58                               ` Linus Torvalds
2021-10-18 17:13                                 ` Catalin Marinas
2021-10-21  0:46                             ` Andreas Gruenbacher
2021-10-21 10:05                               ` Catalin Marinas [this message]
2021-10-21 14:42                                 ` Andreas Gruenbacher
2021-10-21 17:09                                   ` Catalin Marinas
2021-10-21 18:00                                     ` Andreas Gruenbacher
2021-10-22 18:41                                       ` Catalin Marinas
2021-10-25 19:37                                         ` Andreas Gruenbacher
2021-10-22  2:30                                   ` Linus Torvalds
2021-10-22  9:34                                     ` Catalin Marinas
2021-08-29  0:58               ` [PATCH v7 05/19] iov_iter: Introduce fault_in_iov_iter_writeable Al Viro
2021-08-27 16:49 ` [PATCH v7 06/19] gfs2: Add wrapper for iomap_file_buffered_write Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 07/19] gfs2: Clean up function may_grant Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 08/19] gfs2: Eliminate vestigial HIF_FIRST Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 09/19] gfs2: Remove redundant check from gfs2_glock_dq Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 10/19] gfs2: Introduce flag for glock holder auto-demotion Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 11/19] gfs2: Move the inode glock locking to gfs2_file_buffered_write Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 12/19] gfs2: Eliminate ip->i_gh Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 13/19] gfs2: Fix mmap + page fault deadlocks for buffered I/O Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 14/19] iomap: Fix iomap_dio_rw return value for user copies Andreas Gruenbacher
2021-09-03 18:54   ` Darrick J. Wong
2021-09-09 11:17   ` Christoph Hellwig
2021-08-27 16:49 ` [PATCH v7 15/19] iomap: Support partial direct I/O on user copy failures Andreas Gruenbacher
2021-09-03 18:54   ` Darrick J. Wong
2021-09-09 11:20   ` Christoph Hellwig
2021-09-28 15:05     ` Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 16/19] iomap: Add done_before argument to iomap_dio_rw Andreas Gruenbacher
2021-08-27 18:30   ` Darrick J. Wong
2021-08-27 20:15     ` Andreas Gruenbacher
2021-08-27 21:32       ` Darrick J. Wong
2021-08-27 21:49         ` Andreas Grünbacher
2021-08-27 22:35         ` Linus Torvalds
2021-09-03 18:47           ` Darrick J. Wong
2021-09-03 18:53   ` Darrick J. Wong
2021-09-09 11:30   ` Christoph Hellwig
2021-09-09 17:22     ` Linus Torvalds
2021-09-10  7:36       ` Christoph Hellwig
2021-08-27 16:49 ` [PATCH v7 17/19] gup: Introduce FOLL_NOFAULT flag to disable page faults Andreas Gruenbacher
2021-09-09 11:36   ` Christoph Hellwig
2021-09-09 17:17     ` Linus Torvalds
2021-09-10  7:24       ` Christoph Hellwig
2021-08-27 16:49 ` [PATCH v7 18/19] iov_iter: Introduce nofault " Andreas Gruenbacher
2021-08-27 18:47   ` Al Viro
2021-08-27 19:56     ` Andreas Gruenbacher
2021-08-27 16:49 ` [PATCH v7 19/19] gfs2: Fix mmap + page fault deadlocks for direct I/O Andreas Gruenbacher
2021-08-27 17:16 ` [PATCH v7 00/19] gfs2: Fix mmap + page fault deadlocks Linus Torvalds
2021-09-01 19:52   ` Andreas Gruenbacher
2021-09-03 15:52     ` Linus Torvalds
2021-09-03 18:25       ` Al Viro
2021-09-03 18:47         ` Linus Torvalds
2021-09-03 15:07 ` Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YXE7fhDkqJbfDk6e@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=agruenba@redhat.com \
    --cc=cluster-devel@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).