From: Dave Chinner <david@fromorbit.com>
To: Ming Lei <tom.leiming@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>,
"open list:XFS FILESYSTEM" <linux-xfs@vger.kernel.org>,
Jens Axboe <axboe@kernel.dk>,
linux-block <linux-block@vger.kernel.org>
Subject: Re: [PATCH 3/3] xfs: alignment check bio buffers
Date: Thu, 22 Aug 2019 14:49:05 +1000 [thread overview]
Message-ID: <20190822044905.GU1119@dread.disaster.area> (raw)
In-Reply-To: <CACVXFVN93h7QrFvZNVQQwYZg_n0wGXwn=XZztMJrNbdjzzSpKQ@mail.gmail.com>
On Thu, Aug 22, 2019 at 10:50:02AM +0800, Ming Lei wrote:
> On Thu, Aug 22, 2019 at 8:06 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Wed, Aug 21, 2019 at 06:38:20PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > Add memory buffer alignment validation checks to bios built in XFS
> > > to catch bugs that will result in silent data corruption in block
> > > drivers that cannot handle unaligned memory buffers but don't
> > > validate the incoming buffer alignment is correct.
> > >
> > > Known drivers with these issues are xenblk, brd and pmem.
> > >
> > > Despite there being nothing XFS specific to xfs_bio_add_page(), this
> > > function was created to do the required validation because the block
> > > layer developers that keep telling us that is not possible to
> > > validate buffer alignment in bio_add_page(), and even if it was
> > > possible it would be too much overhead to do at runtime.
> >
> > I really don't think we should life this to XFS, but instead fix it
> > in the block layer. And that is not only because I have a pending
> > series lifting bits you are touching to the block layer..
> >
> > > +int
> > > +xfs_bio_add_page(
> > > + struct bio *bio,
> > > + struct page *page,
> > > + unsigned int len,
> > > + unsigned int offset)
> > > +{
> > > + struct request_queue *q = bio->bi_disk->queue;
> > > + bool same_page = false;
> > > +
> > > + if (WARN_ON_ONCE(!blk_rq_aligned(q, len, offset)))
> > > + return -EIO;
> > > +
> > > + if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
> > > + if (bio_full(bio, len))
> > > + return 0;
> > > + __bio_add_page(bio, page, len, offset);
> > > + }
> > > + return len;
> >
> > I know Jens disagree, but with the amount of bugs we've been hitting
> > thangs to slub (and I'm pretty sure we have a more hiding outside of
> > XFS) I think we need to add the blk_rq_aligned check to bio_add_page.
>
> It isn't correct to blk_rq_aligned() here because 'len' has to be logical block
> size aligned, instead of DMA aligned only.
News to me.
AFAIA, the overall _IO_ that is being built needs to be a multiple
of the logical block size in total size (i.e. bio->bi_iter.size)
because sub sector IO is not allowed. But queue DMA limits are not
defined in sectors - they define the scatter/gather DMA capability
of the hardware, and that's what individual segments (bvecs) need to
align to. That's what blk_rq_aligned() checks here - that the bvec
segment aligns to what the underlying driver(s) requires, not that
the entire IO is sector sized and aligned.
Also, think about multipage bvecs - the pages we are spanning here
are contiguous pages, so this should end up merging them and turning
it into a single multipage bvec whose length is sector size
aligned...
> Also not sure all users may setup bio->bi_disk well before adding page to bio,
> since it is allowed to do that now.
XFS does, so I just don't care about random users of bio_add_page()
in this patch. Somebody else can run the block layer gauntlet to get
these checks moved into generic code and they've already been
rejected twice as unnecessary.
> If slub buffer crosses two pages, block layer may not handle it at all
> even though
> un-aligned 'offset' issue is solved.
A slub buffer crossing two _contiguous_ pages should end up merged
as a multipage bvec. But I'm curious, what does adding multiple
contiguous pages to a bio actually break?
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2019-08-22 4:50 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20190821083820.11725-1-david@fromorbit.com>
[not found] ` <20190821083820.11725-4-david@fromorbit.com>
2019-08-21 23:29 ` [PATCH 3/3] xfs: alignment check bio buffers Christoph Hellwig
2019-08-22 0:37 ` Dave Chinner
2019-08-22 8:03 ` Christoph Hellwig
2019-08-22 10:17 ` Dave Chinner
2019-08-22 2:50 ` Ming Lei
2019-08-22 4:49 ` Dave Chinner [this message]
2019-08-22 7:23 ` Ming Lei
2019-08-22 8:08 ` Christoph Hellwig
2019-08-22 10:20 ` Ming Lei
2019-08-23 0:14 ` Christoph Hellwig
2019-08-23 1:19 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190822044905.GU1119@dread.disaster.area \
--to=david@fromorbit.com \
--cc=axboe@kernel.dk \
--cc=hch@infradead.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=tom.leiming@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).