linux-fsdevel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	linux-xfs@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Dave Chinner <dchinner@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	Aaron Lu <aaron.lu@intel.com>, Christopher Lameter <cl@linux.com>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [PATCH] xfs: allocate sector sized IO buffer via page_frag_alloc
Date: Tue, 26 Feb 2019 14:02:14 +1100
Message-ID: <20190226030214.GI23020@dastard>
In-Reply-To: <20190226022249.GA17747@ming.t460p>

On Tue, Feb 26, 2019 at 10:22:50AM +0800, Ming Lei wrote:
> On Tue, Feb 26, 2019 at 07:26:30AM +1100, Dave Chinner wrote:
> > On Mon, Feb 25, 2019 at 02:15:59PM +0100, Vlastimil Babka wrote:
> > > On 2/25/19 5:36 AM, Dave Chinner wrote:
> > > > On Mon, Feb 25, 2019 at 12:09:04PM +0800, Ming Lei wrote:
> > > >> XFS uses kmalloc() to allocate sector sized IO buffer.
> > > > ....
> > > >> Use page_frag_alloc() to allocate the sector sized buffer, then the
> > > >> above issue can be fixed because offset_in_page of allocated buffer
> > > >> is always sector aligned.
> > > > 
> > > > Didn't we already reject this approach because page frags cannot be
> > > > reused and that pages allocated to the frag pool are pinned in
> > > > memory until all fragments allocated on the page have been freed?
> > > 
> > > > I don't know if you did, but it's certainly true. Also I don't think
> > > there's any specified alignment guarantee for page_frag_alloc().
> > 
> > We did, and the alignment guarantee would have come from all
> > fragments having an aligned size.
> > 
> > > What about kmem_cache_create() with align parameter? That *should* be
> > > guaranteed regardless of whatever debugging is enabled - if not, I would
> > > consider it a bug.
> > 
> > > Yup, that's pretty much what was decided. The sticking point was
> > > whether it should be block layer infrastructure (because the actual
> > > memory buffer alignment is a block/device driver requirement not
> > > visible to the filesystem) or whether "sector size alignment is
> > > good enough for everyone".
> 
> OK, it looks like I missed the long lifetime of metadata caching, so
> let's discuss the slab approach.
> 
> It looks like one single slab cache doesn't work, given the size may be
> 512 * N (1 <= N < PAGE_SIZE/512); that is basically what I posted the
> first time.
> 
> https://marc.info/?t=153986884900007&r=1&w=2
> https://marc.info/?t=153986885100001&r=1&w=2
> 
> Or what is the exact size of sub-page IO in xfs most of the time? For

It is determined by mkfs parameters. Any power of 2 between 512 bytes
and 64kB needs to be supported, e.g.:

# mkfs.xfs -s size=512 -b size=1k -i size=2k -n size=8k ....

will have metadata that is sector sized (512 bytes), filesystem
block sized (1k), directory block sized (8k) and inode cluster sized
(32k), and will use all of them in large quantities.
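
As a rough userspace illustration of the "any power of 2 between 512
bytes and 64kB" constraint above (not XFS code; the names
META_MIN_SIZE, META_MAX_SIZE and meta_size_supported() are made up
for this sketch), a size check could look something like this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounds, taken from the sizes named in the message. */
#define META_MIN_SIZE 512u
#define META_MAX_SIZE (64u * 1024u)

/*
 * Return true if size is a power of two in [512 bytes, 64kB].
 * A power of two has exactly one bit set, so size & (size - 1) is 0.
 */
bool meta_size_supported(size_t size)
{
	if (size < META_MIN_SIZE || size > META_MAX_SIZE)
		return false;
	return (size & (size - 1)) == 0;
}
```

All of the sizes in the mkfs example (512, 1k, 8k, 32k) pass this
check, while something like 1536 bytes (512 * 3) does not.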

> example, if 99% times falls in 512 byte allocation, maybe it is enough
> to just maintain one 512byte slab.

It is not. On a 64k page size machine, we use sub page slabs for
metadata blocks of 2^N bytes where 9 <= N <= 15.
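
To make the number of size classes concrete, here is a minimal
userspace sketch, assuming one cache per power-of-two size (this is
not kernel code; SLAB_MIN_SHIFT, SLAB_MAX_SHIFT, NR_SUBPAGE_SLABS and
subpage_slab_index() are hypothetical names used only for
illustration):

```c
#include <stddef.h>

#define SLAB_MIN_SHIFT 9	/* 2^9  = 512 bytes */
#define SLAB_MAX_SHIFT 15	/* 2^15 = 32kB, half of a 64kB page */
#define NR_SUBPAGE_SLABS (SLAB_MAX_SHIFT - SLAB_MIN_SHIFT + 1)

/*
 * Map a buffer size of 2^N bytes (9 <= N <= 15) to a cache index
 * in the range 0..NR_SUBPAGE_SLABS-1. Return -1 for any size that
 * is not a power of two or falls outside that range.
 */
int subpage_slab_index(size_t size)
{
	int shift = 0;

	/* Reject zero and anything with more than one bit set. */
	if (size == 0 || (size & (size - 1)) != 0)
		return -1;

	/* Find N such that 2^N == size. */
	while (((size_t)1 << shift) < size)
		shift++;

	if (shift < SLAB_MIN_SHIFT || shift > SLAB_MAX_SHIFT)
		return -1;
	return shift - SLAB_MIN_SHIFT;
}
```

Under that assumption there are seven distinct sub-page sizes to
cover on a 64k page size machine, not one.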

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 30+ messages
2019-02-25  4:09 [PATCH] xfs: allocate sector sized IO buffer via page_frag_alloc Ming Lei
2019-02-25  4:36 ` Dave Chinner
2019-02-25  8:46   ` Ming Lei
2019-02-25 10:03     ` Ming Lei
2019-02-25 20:11     ` Dave Chinner
2019-02-25 13:15   ` Vlastimil Babka
2019-02-25 20:26     ` Dave Chinner
2019-02-26  2:22       ` Ming Lei
2019-02-26  3:02         ` Dave Chinner [this message]
2019-02-26  3:27           ` Matthew Wilcox
2019-02-26  4:58             ` Dave Chinner
2019-02-26  9:33               ` Ming Lei
2019-02-26 10:06                 ` Vlastimil Babka
2019-02-26 11:12                   ` Ming Lei
2019-02-26 12:12                     ` Matthew Wilcox
2019-02-26 12:35                       ` Ming Lei
2019-02-26 13:02                         ` Matthew Wilcox
2019-02-26 13:42                           ` Ming Lei
2019-02-26 14:04                             ` Matthew Wilcox
2019-02-26 16:14                               ` Darrick J. Wong
2019-02-26 16:19                                 ` Matthew Wilcox
2019-02-27  1:41                                   ` Ming Lei
2019-02-27  7:07                                   ` Vlastimil Babka
2019-03-08  8:18                                     ` Christoph Hellwig
2019-02-27 21:38                                 ` Dave Chinner
2019-02-26 15:30                             ` Christopher Lameter
2019-02-26 20:45                 ` Dave Chinner
2019-02-27  1:50                   ` Ming Lei
2019-02-27  3:41                     ` Dave Chinner
2019-02-26 15:20     ` Christopher Lameter
