All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andres Freund <andres@anarazel.de>
To: Matthew Wilcox <willy@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: io_uring force_nonblock vs POSIX_FADV_WILLNEED
Date: Sun, 2 Feb 2020 23:42:57 -0800	[thread overview]
Message-ID: <20200203074257.nx23pigjtmgbyyyz@alap3.anarazel.de> (raw)
In-Reply-To: <20200203064047.GC8731@bombadil.infradead.org>

Hi Matthew,

On 2020-02-02 22:40:47 -0800, Matthew Wilcox wrote:
> On Sat, Feb 01, 2020 at 01:43:09AM -0800, Andres Freund wrote:
> > As far as I can tell POSIX_FADV_WILLNEED synchronously starts readahead,
> > including page allocation etc, which of course might trigger quite
> > blocking. The fs also quite possibly needs to read metadata.
> > 
> > 
> > Seems like either WILLNEED would have to always be deferred, or
> > force_page_cache_readahead, __do_page_cache_readahead would etc need to
> > be wired up to know not to block. Including returning EAGAIN, despite
> > force_page_cache_readahead and generic_readahead() intentially ignoring
> > return values / errors.
> 
> The first step is going to be letting the readahead code know that it
> should have this behaviour, which is tricky because the code flow looks
> like this:
> 
> io_fadvise
>   vfs_fadvise
>     file->f_op->fadvise()

Yea.


> ... and we'd be breaking brand new ground trying to add a gfp_t to a
> file_operations method.  Which is not to say it couldn't be done, but
> would mean changing filesystems, just so we could pass the gfp
> flags through from the top level to the low level.  It wouldn't be
> too bad; only two filesystems implement an ->fadvise op today.

I was wondering if the right approach could be to pass through a kiocb
instead of gfp_t. There's obviously precedent for that in
file_operations, and then IOCB_NOWAIT could be used to represent the the
intent to not block. It'd be a bit weird in the sense that currently
there'd probably be no callback - but that seems fairly minor. And who
knows, 


> Next possibility, we could add a POSIX_FADV_WILLNEED_ASYNC advice
> flag.

POSIX_FADV_DONTNEED has similar problems to POSIX_FADV_WILLNEED, so it'd
be nice to come up with an API change to vfs_fadvise that'd support
both. Obviously there also could be a POSIX_FADV_DONTNEED_ASYNC, but ...


> Something I already want to see in an entirely different context is
> a flag in the task_struct which says, essentially, "don't block in
> memory allocations" -- ie behave as if __GFP_NOWAIT | __GFP_NOWARN
> is set.  See my proposal here:

I'm a bit out of my depth here: Would __GFP_NOWAIT actually be suitable
to indicate that no blocking IO is to be executed by the FS? E.g. for
metadata? As far as I can tell that's also a problem, not just reclaim
to make space for the to-be-read data.


> I've got my head stuck in the middle of the readahead code right now,
> so this seems like a good time to add this functionality.  Once I'm done
> with finding out who broke my test VM, I'll take a shot at adding
> this.

Cool!

Greetings,

Andres Freund

      reply	other threads:[~2020-02-03 11:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-01  9:43 io_uring force_nonblock vs POSIX_FADV_WILLNEED Andres Freund
2020-02-01 16:22 ` Jens Axboe
2020-02-02  7:14   ` Andres Freund
2020-02-02 16:34     ` Jens Axboe
2020-02-03  6:40 ` Matthew Wilcox
2020-02-03  7:42   ` Andres Freund [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200203074257.nx23pigjtmgbyyyz@alap3.anarazel.de \
    --to=andres@anarazel.de \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.