linux-fsdevel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Christopher Lameter <cl@linux.com>,
	Christoph Hellwig <hch@lst.de>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Ming Lei <tom.leiming@gmail.com>,
	linux-block <linux-block@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	"open list:XFS FILESYSTEM" <linux-xfs@vger.kernel.org>,
	Dave Chinner <dchinner@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ming Lei <ming.lei@redhat.com>
Subject: Re: block: DMA alignment of IO buffer allocated from slab
Date: Tue, 25 Sep 2018 17:49:10 +1000
Message-ID: <20180925074910.GB31060@dastard>
In-Reply-To: <1a5b255f-682e-783a-7f99-9d02e39c4af2@kernel.dk>

On Mon, Sep 24, 2018 at 12:09:37PM -0600, Jens Axboe wrote:
> On 9/24/18 12:00 PM, Christopher Lameter wrote:
> > On Mon, 24 Sep 2018, Jens Axboe wrote:
> > 
> >> The situation is making me a little uncomfortable, though. If we export
> >> such a setting, we really should be honoring it...

That's what I said up front, but you replied to this with:

| I think this is all crazy talk. We've never done this, [...]

Now I'm not sure what you are saying we should do....

> > Various subsystems create custom slab arrays with their particular
> > alignment requirement for these allocations.
> 
> Oh yeah, I think the solution is basic enough for XFS, for instance.
> They just have to err on the side of being cautious, by going full
> sector alignment for memory...

How does the filesystem find out about hardware alignment
requirements? Isn't probing through the block device to find out
about the request queue configurations considered a layering
violation?

What if sector alignment is not sufficient?  And how would this work
if we start supporting sector sizes larger than page size? (which the
XFS buffer cache supports just fine, even if nothing else in
Linux does).

But even ignoring sector size > page size, implementing this
requires a bunch of new slab caches, especially for 64k page
machines because XFS supports sector sizes up to 32k.  And every
other filesystem that uses sector sized buffers (e.g. HFS) would
have to do the same thing. Seems somewhat wasteful to require
everyone to implement their own aligned sector slab cache...
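
To put a rough number on that, here is a small userspace sketch. It is
not kernel code: nr_sector_caches() is a made-up helper, the 512 byte
lower bound is an assumption, and it assumes that any sector size of a
page or more would come from the page allocator instead of a slab.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bounds: 512 bytes as the smallest sector size considered,
 * 32k as the XFS maximum mentioned above. */
#define MIN_SECTOR_SIZE 512u
#define MAX_SECTOR_SIZE 32768u

/*
 * Hypothetical helper: count the power-of-two sector sizes that are
 * smaller than a page. Under the assumption stated above, each of these
 * would need its own aligned slab cache.
 */
unsigned int nr_sector_caches(size_t page_size)
{
	unsigned int n = 0;
	size_t sz;

	/* page sizes are non-zero powers of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	for (sz = MIN_SECTOR_SIZE; sz <= MAX_SECTOR_SIZE; sz <<= 1)
		if (sz < page_size)
			n++;
	return n;
}
```

Under those assumptions a 4k page machine needs three such caches (512,
1k, 2k), while a 64k page machine needs all seven sizes from 512 to 32k.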

Perhaps we should take the filesystem out of this completely - maybe
the block layer could provide a generic "sector heap" and have all
filesystems that use sector sized buffers allocate from it. e.g.
something like

	mem = bdev_alloc_sector_buffer(bdev, sector_size)

That way we don't have to rely on filesystems knowing anything about
the alignment limitations of the devices or assumptions about DMA
to work correctly...
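
As a rough illustration of the interface only, a userspace analogue
might look like the sketch below. Everything in it is an assumption:
struct fake_bdev is a stand-in and not the kernel's struct block_device,
the alignment is stored in bytes, and posix_memalign() takes the place
of whatever allocator the block layer would really use.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block device. */
struct fake_bdev {
	size_t dma_alignment;	/* required alignment in bytes, power of two */
};

/*
 * Userspace analogue of the proposed call: return a buffer of
 * sector_size bytes aligned to the larger of the sector size and the
 * device's DMA alignment, or NULL on failure. The caller never needs
 * to know what the device's alignment requirement is.
 */
void *bdev_alloc_sector_buffer(const struct fake_bdev *bdev,
			       size_t sector_size)
{
	size_t align = sector_size;
	void *mem = NULL;

	assert(bdev != NULL);

	/* sector_size must be a non-zero power of two */
	if (sector_size == 0 || (sector_size & (sector_size - 1)))
		return NULL;

	if (bdev->dma_alignment > align)
		align = bdev->dma_alignment;

	/* posix_memalign() wants a power-of-two multiple of sizeof(void *) */
	if (align < sizeof(void *))
		align = sizeof(void *);

	if (posix_memalign(&mem, align, sector_size))
		return NULL;
	return mem;
}
```

The point of the shape is that the alignment decision lives behind the
allocation call, next to the device, so the filesystem only passes the
sector size it wants. Buffers from this sketch are released with free().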

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

