linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Dave Chinner <david@fromorbit.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.com>,
	linux-fsdevel@vger.kernel.org, jack@suse.com, hch@infradead.org,
	linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	sagi@grimberg.me, avi@scylladb.com, axboe@kernel.dk,
	linux-api@vger.kernel.org, willy@infradead.org
Subject: Re: [PATCH 5/8] nowait aio: return on congested block device
Date: Mon, 20 Mar 2017 18:33:12 +0100	[thread overview]
Message-ID: <20170320173312.GA10436@quack2.suse.cz> (raw)
In-Reply-To: <045e06ce-ac19-6293-a62b-8ac937f753ac@suse.de>

On Fri 17-03-17 07:23:24, Goldwyn Rodrigues wrote:
> On 03/16/2017 04:31 PM, Dave Chinner wrote:
> > On Wed, Mar 15, 2017 at 04:51:04PM -0500, Goldwyn Rodrigues wrote:
> >> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> >>
> >> A new flag BIO_NOWAIT is introduced to identify bio's
> >> orignating from iocb with IOCB_NOWAIT. This flag indicates
> >> to return immediately if a request cannot be made instead
> >> of retrying.
> > 
> > So this makes a congested block device run the bio IO completion
> > callback with an -EAGAIN error present? Are all the filesystem
> > direct IO submission and completion routines OK with that? i.e. does
> > such a congestion case cause filesystems to temporarily expose stale
> > data to unprivileged users when the IO is requeued in this way?
> > 
> > e.g. filesystem does allocation without blocking, submits bio,
> > device is congested, runs IO completion with error, so nothing
> > written to allocated blocks, write gets queued, so other read
> > comes in while the write is queued, reads data from uninitialised
> > blocks that were allocated during the write....
> > 
> > Seems kinda problematic to me to have a undocumented design
> > constraint (i.e a landmine) where we submit the AIO only to have it
> > error out and then expect the filesystem to do something special and
> > different /without blocking/ on EAGAIN.
> 
> If the filesystems has to perform block allocation, we would return
> -EAGAIN early enough. However, I agree there is a problem, since not all
> filesystems know this. I worked on only three of them.

So Dave has a good point about the missing documentation explaining the
expectation from the filesystem. Other than that the filesystem is
responsible for determining in its direct IO submission routine whether it
can deal with NOWAIT direct IO without blocking (including backing out of
block layer refuses to do the IO) and if not, just return with EAGAIN right
away.

Probably we should also have a feature flag in filesystem_type telling
whether a particular filesystem knows about NOWAIT direct IO at all so that
we don't have to add stub to each and every filesystem supporting direct IO
that returns EAGAIN when NOWAIT is set (OTOH adding that stub won't be that
hard either since there are not *that* many filesystems supporting direct
IO). But one or the other has to be done.

> > Why isn't the congestion check at a higher layer like we do for page
> > cache readahead? i.e. using the bdi*congested() API at the time we
> > are doing all the other filesystem blocking checks.
> > 
> 
> Yes, that may work better. We will have to call bdi_read_congested() on
> a write path. (will have to comment that part of the code). Would it
> encompass all possible waits in the block layer?

Congestion mechanism is not really reliable. The bdi can show the device is
not congested but block layer will still block you waiting for some
resource. And for some of the stuff like tag allocation you cannot really
do a reliable check in advance so I don't think congestion checks are
really a viable alternative.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2017-03-20 17:33 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-15 21:50 [PATCH 0/8 v3] No wait AIO Goldwyn Rodrigues
2017-03-15 21:51 ` [PATCH 1/8] nowait aio: Introduce IOCB_RW_FLAG_NOWAIT Goldwyn Rodrigues
2017-03-15 21:51 ` [PATCH 2/8] nowait aio: Return if cannot get hold of i_rwsem Goldwyn Rodrigues
2017-03-15 21:51 ` [PATCH 3/8] nowait aio: return if direct write will trigger writeback Goldwyn Rodrigues
2017-03-16 13:08   ` Matthew Wilcox
2017-03-16 13:46     ` Goldwyn Rodrigues
2017-03-16 13:20   ` Matthew Wilcox
     [not found]     ` <20170316132052.GG4033-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org>
2017-03-16 13:46       ` Goldwyn Rodrigues
2017-03-15 21:51 ` [PATCH 4/8] nowait-aio: Introduce IOMAP_NOWAIT Goldwyn Rodrigues
2017-03-15 21:51 ` [PATCH 5/8] nowait aio: return on congested block device Goldwyn Rodrigues
     [not found]   ` <20170315215107.5628-6-rgoldwyn-l3A5Bk7waGM@public.gmane.org>
2017-03-16 14:33     ` Jens Axboe
2017-03-17  2:03       ` Ming Lei
2017-03-17 12:23       ` Goldwyn Rodrigues
     [not found]       ` <eee4683d-9f44-434f-b97f-b0b24c7b3dab-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
2017-03-24 11:32         ` Goldwyn Rodrigues
2017-03-24 14:39           ` Jens Axboe
2017-03-16 21:31     ` Dave Chinner
2017-03-17 12:23       ` Goldwyn Rodrigues
2017-03-20 17:33         ` Jan Kara [this message]
2017-03-15 21:51 ` [PATCH 6/8] nowait aio: ext4 Goldwyn Rodrigues
2017-03-15 21:51 ` [PATCH 7/8] nowait aio: xfs Goldwyn Rodrigues
2017-03-15 21:51 ` [PATCH 8/8] nowait aio: btrfs Goldwyn Rodrigues
2017-04-03 18:52 [PATCH 0/8 v4] No wait AIO Goldwyn Rodrigues
2017-04-03 18:53 ` [PATCH 5/8] nowait aio: return on congested block device Goldwyn Rodrigues
2017-04-04  6:49   ` Christoph Hellwig
2017-04-14 12:02 [PATCH 0/8 v6] No wait AIO Goldwyn Rodrigues
2017-04-14 12:02 ` [PATCH 5/8] nowait aio: return on congested block device Goldwyn Rodrigues
2017-04-19  6:45   ` Christoph Hellwig
2017-04-19 15:21     ` Goldwyn Rodrigues
2017-04-20 13:43       ` Jan Kara
2017-04-24 21:10     ` Goldwyn Rodrigues
2017-04-25  2:28       ` Jens Axboe
2017-05-09 12:22 [PATCH 0/8 v7] No wait AIO Goldwyn Rodrigues
2017-05-09 12:22 ` [PATCH 5/8] nowait aio: return on congested block device Goldwyn Rodrigues
2017-05-11  7:44   ` Christoph Hellwig
     [not found]     ` <20170511074451.GD15626-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2017-05-11 18:16       ` Goldwyn Rodrigues

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170320173312.GA10436@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=avi@scylladb.com \
    --cc=axboe@kernel.dk \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=rgoldwyn@suse.com \
    --cc=rgoldwyn@suse.de \
    --cc=sagi@grimberg.me \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).