From: Keith Busch <kbusch@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <kbusch@fb.com>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	axboe@kernel.dk, Kernel Team <Kernel-team@fb.com>,
	bvanassche@acm.org, damien.lemoal@opensource.wdc.com
Subject: Re: [PATCHv2 3/3] block: relax direct io memory alignment
Date: Thu, 19 May 2022 08:08:50 -0600
Message-ID: <YoZPcqDpwSTn/csn@kbusch-mbp>
In-Reply-To: <20220519073811.GE22301@lst.de>

On Thu, May 19, 2022 at 09:38:11AM +0200, Christoph Hellwig wrote:
> > @@ -1207,6 +1207,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> >  {
> >  	unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
> >  	unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
> > +	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
> >  	struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
> >  	struct page **pages = (struct page **)bv;
> >  	bool same_page = false;
> > @@ -1223,6 +1224,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> >  	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
> >  
> >  	size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
> > +	if (size > 0)
> > +		size = ALIGN_DOWN(size, queue_logical_block_size(q));
> 
> So if we do get a size that is not logical block size alignment here,
> we reduce it to the block size aligned one below.  Why do we do that?

There are two possibilities:

In the first case, the number of pages in this iteration exceeds bi_max_vecs.
Rounding down completes the bio with a block aligned size, and the remainder
will be picked up for the next bio, or possibly even the current bio if the
pages are sufficiently physically contiguous.

The other case is a bad iov. If we're doing __blkdev_direct_IO(), it will error
out immediately if the rounded size is 0, or on the next iteration when the next
size is rounded to 0. If we're doing __blkdev_direct_IO_simple(), it will
error out when it sees the iov hasn't advanced to the end.
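
To make the rounding concrete, here is a minimal userspace sketch of the
arithmetic only. It is not the kernel code: align_down() stands in for the
kernel's ALIGN_DOWN() macro, and bytes_for_this_bio() and its parameters are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's ALIGN_DOWN(); align must be a power of two. */
static size_t align_down(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return size & ~(align - 1);
}

/*
 * Hypothetical model of one iteration: 'got' bytes were pinned from the
 * iov and the device has logical block size 'lbs'. Returns the number of
 * bytes that would go into the current bio and stores the leftover tail,
 * which a later iteration would have to pick up, in *remainder.
 */
static size_t bytes_for_this_bio(size_t got, size_t lbs, size_t *remainder)
{
	size_t usable = align_down(got, lbs);

	*remainder = got - usable;
	return usable;
}
```

With a 512 byte logical block size, 5000 pinned bytes round down to 4608 with
392 left over for the next iteration (the first case), while 300 pinned bytes
round down to 0, which is the condition the callers treat as an error (the
bad iov case).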

And ... I just noticed I missed the size check in __blkdev_direct_IO_async().
 
> > +	if ((pos | iov_iter_count(iter)) & (bdev_logical_block_size(bdev) - 1))
> > +		return -EINVAL;
> > +	if (iov_iter_alignment(iter) & bdev_dma_alignment(bdev))
> >  		return -EINVAL;
> 
> Can we have a little inline helper for these checks instead of
> duplicating them three times?

Absolutely.
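
A rough userspace sketch of what such a helper could look like follows. The
name dio_misaligned() and its flat parameter list are assumptions for
illustration, not the helper that will be in the next version: in the kernel
it would take the bdev and iter and call bdev_logical_block_size(),
bdev_dma_alignment() and iov_iter_alignment() itself. The sketch assumes, as
in the quoted hunk, that the DMA alignment is a mask and that the memory
alignment value is the OR of all segment addresses and lengths.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical combined check, mirroring the two tests in the quoted hunk.
 *   pos           file/device offset of the I/O
 *   count         total bytes in the iov
 *   mem_alignment OR of every segment's address and length
 *   lbs           logical block size, a power of two
 *   dma_mask      DMA alignment mask (e.g. 3 for 4 byte alignment)
 * Returns true if the request must be rejected with -EINVAL.
 */
static bool dio_misaligned(uint64_t pos, size_t count, uintptr_t mem_alignment,
			   unsigned int lbs, unsigned int dma_mask)
{
	/* Offset and total length still have to be logical block aligned. */
	if ((pos | count) & (lbs - 1))
		return true;

	/* Memory addresses and segment lengths only need DMA alignment. */
	return (mem_alignment & dma_mask) != 0;
}
```

For example, with a 512 byte logical block size and a DMA mask of 3, a buffer
at a 4 byte aligned address such as 0x1004 passes as long as pos and the total
count are multiples of 512, while a buffer at 0x1002 is rejected.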

> > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > index 840752006f60..64cc176be60c 100644
> > --- a/fs/direct-io.c
> > +++ b/fs/direct-io.c
> > @@ -1131,7 +1131,7 @@ ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> >  	struct dio_submit sdio = { 0, };
> >  	struct buffer_head map_bh = { 0, };
> >  	struct blk_plug plug;
> > -	unsigned long align = offset | iov_iter_alignment(iter);
> > +	unsigned long align = iov_iter_alignment(iter);
> 
> I'd much prefer to not just relax this for random file systems,
> and especially not the legacy direct I/O code.  I think we can eventually
> do iomap, but only after an audit and test of each file system, which
> might require a new IOMAP_DIO_* flag at least initially.

I did some testing with xfs, but I can certainly run a lot more tests. I
do think filesystem support for this capability is important, so I hope we
eventually get there.
