From: Catalin Marinas <catalin.marinas@arm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
Matthew Wilcox <willy@infradead.org>,
Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Will Deacon <will@kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 3/3] btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
Date: Mon, 29 Nov 2021 20:56:08 +0000 [thread overview]
Message-ID: <YaU+aDG5pCAba57r@arm.com> (raw)
In-Reply-To: <CAHk-=wiZgAgcynfLsop+D1xBUAZ-Z+NUBxe9mb-AedecFRNm+w@mail.gmail.com>
On Mon, Nov 29, 2021 at 10:40:38AM -0800, Linus Torvalds wrote:
> On Mon, Nov 29, 2021 at 7:36 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > That's what this series does when it probes the whole range in
> > fault_in_writeable(). The main reason was that it's more efficient to do
> > a read than a write on a large range (the latter dirtying the cache
> > lines).
>
> The more this thread goes on, the more I'm starting to think that we
> should just make "fault_in_writable()" (and readable, of course) only
> really work on the beginning of the area.
>
> Not just for the finer-granularity pointer color probing, but for the
> page probing too.
I have patches for the finer-granularity checking of the beginning of
the buffer. They need a bit of testing, so probably posting them
tomorrow.
> I'm looking at our current fault_in_writeable(), and I'm going
>
> (a) it uses __put_user() without range checks, which is really not great
For arm64 at least __put_user() does the access_ok() check. I thought
only unsafe_put_user() should skip the checks. If __put_user() can write
arbitrary memory, we may have a bigger problem.
> (b) it looks like a disaster from another standpoint: essentially
> user-controlled loop size with no limit checking, no preemption, and
> no check for fatal signals.
Indeed, the fault_in_*() loop can get pretty long, bounded by how much
memory can be faulted in the user process. My patches for now only
address the outer loop doing the copy_to_user() as that can be
unbounded.
> Now, (a) should be fixed with a access_ok() or similar.
>
> And (b) can easily be fixed multiple ways, with one option simply just
> being adding a can_resched() call and checking for fatal signals.
>
> But faulting in the whole region is actually fundamentally wrong in
> low-memory situations - the beginning of the region might be swapped
> out by the time we get to the end. That's unlikely to be a problem in
> real life, but it's an example of how it's simply not conceptually
> sensible.
>
> So I do wonder why we don't just say "fault_in_writable will fault in
> _at_most_ X bytes", and simply limit the actual fault-in size to
> something reasonable.
>
> That solves _all_ the problems. It solves the lack of preemption and
> fatal signals (by virtue of just limiting the amount of work we do).
> It solves the low memory situation. And it solves the "excessive dirty
> cachelines" case too.
I think that would be useful, though it doesn't solve the potential
livelock with sub-page faults. We still need the outer loop to
handle the copy_to_user() for the whole user buffer and the sub-page
probing of the beginning of such buffer (or whenever copy_to_user()
failed). IOW, you still fault in the whole buffer eventually.
Anyway, I think the sub-page probing and limiting the fault-in are
complementary improvements.
--
Catalin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2021-11-29 21:23 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-24 19:20 [PATCH 0/3] Avoid live-lock in fault-in+uaccess loops with sub-page faults Catalin Marinas
2021-11-24 19:20 ` [PATCH 1/3] mm: Introduce fault_in_exact_writeable() to probe for " Catalin Marinas
2021-11-24 19:20 ` [PATCH 2/3] arm64: Add support for sub-page faults user probing Catalin Marinas
2021-11-24 19:20 ` [PATCH 3/3] btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults Catalin Marinas
2021-11-24 20:03 ` Matthew Wilcox
2021-11-24 20:37 ` Catalin Marinas
2021-11-25 22:25 ` Andreas Gruenbacher
2021-11-25 22:42 ` Catalin Marinas
2021-11-26 22:29 ` Andreas Gruenbacher
2021-11-26 22:57 ` Catalin Marinas
2021-11-27 3:52 ` Andreas Gruenbacher
2021-11-27 14:33 ` Catalin Marinas
2021-11-27 12:39 ` Andreas Gruenbacher
2021-11-27 15:21 ` Catalin Marinas
2021-11-27 18:05 ` Andreas Gruenbacher
2021-11-29 12:16 ` Catalin Marinas
2021-11-29 13:33 ` Andreas Gruenbacher
2021-11-29 15:36 ` Catalin Marinas
2021-11-29 18:40 ` Linus Torvalds
2021-11-29 19:31 ` Andreas Gruenbacher
2021-11-29 20:56 ` Catalin Marinas [this message]
2021-11-29 21:53 ` Linus Torvalds
2021-11-29 23:12 ` Catalin Marinas
2021-11-29 13:52 ` Catalin Marinas
2021-11-24 23:00 ` Linus Torvalds
2021-11-25 11:10 ` Catalin Marinas
2021-11-25 18:13 ` Linus Torvalds
2021-11-25 20:43 ` Catalin Marinas
2021-11-25 21:02 ` Matthew Wilcox
2021-11-25 21:29 ` Catalin Marinas
2021-11-25 21:40 ` Andreas Gruenbacher
2021-11-26 16:42 ` David Sterba
2021-11-24 21:36 ` [PATCH 0/3] Avoid live-lock in fault-in+uaccess loops " Andrew Morton
2021-11-24 22:31 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YaU+aDG5pCAba57r@arm.com \
--to=catalin.marinas@arm.com \
--cc=agruenba@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=dsterba@suse.com \
--cc=josef@toxicpanda.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=will@kernel.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).