All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] xfs: add kmem_alloc_io()
Date: Fri, 23 Aug 2019 08:10:02 -0400	[thread overview]
Message-ID: <20190823121002.GA53137@bfoster> (raw)
In-Reply-To: <20190822223959.GC1119@dread.disaster.area>

On Fri, Aug 23, 2019 at 08:39:59AM +1000, Dave Chinner wrote:
> On Thu, Aug 22, 2019 at 09:40:17AM -0400, Brian Foster wrote:
> > On Thu, Aug 22, 2019 at 07:14:52AM +1000, Dave Chinner wrote:
> > > On Wed, Aug 21, 2019 at 09:35:33AM -0400, Brian Foster wrote:
> > > > On Wed, Aug 21, 2019 at 06:38:19PM +1000, Dave Chinner wrote:
> > > > kmem_alloc_io() interface) to skip further kmem_alloc() calls from this
> > > > path once we see one unaligned allocation. That assumes this behavior is
> > > > tied to functionality that isn't dynamically configured at runtime, of
> > > > course.
> > > 
> > > vmalloc() has a _lot_ more overhead than kmalloc (especially when
> > > vmalloc has to do multiple allocations itself to set up page table
> > > entries) so there is still an overall gain to be had by trying
> > > kmalloc even if 50% of allocations are unaligned.
> > > 
> > 
> > I had the impression that this unaligned allocation behavior is tied to
> > enablement of debug options that otherwise aren't enabled/disabled
> > dynamically. Hence, the unaligned allocation behavior is persistent for
> > a particular mount and repeated attempts are pointless once we see at
> > least one such result. Is that not the case?
> 
> The alignment for a given slab is likely to be persistent, but it
> will be different for different sized slabs. e.g. I saw 128 offsets
> for 512 slabs, and 1024 byte offsets for 4k slabs. The 1024 byte
> offsets worked just fine (because multiple of 512 bytes!) but the
> 128 byte ones didn't.
> 

Interesting. I wasn't aware offsets could be large enough to maintain
alignment requirements.

> Hence it's not a black and white "everythign is unaligned and
> unsupportable" situation, nor is the alignment necessarily an issue
> for the underlying driver. e.g. most scsi and nvme handle 8
> byte alignment of buffers, and if the buffer is not aligned they
> bounce it (detected via the same blk_rq_alignment() check I added)
> and can still do the IO anyway. So a large amount of the IO stack
> just doesn't care about user buffers being unaligned....
> 
> > Again, I don't think performance is a huge deal so long as testing shows
> > that an fs is still usable with XFS running this kind of allocation
> > pattern.
> 
> It's added 10 minutes to the runtime of a full auto run with KASAN
> enabled on pmem. To put that in context, the entire run took:
> 
> real    461m36.831s
> user    44m31.779s
> sys     708m37.467s
> 
> More than 7.5 hours to complete, so ten minutes here or there is
> noise.
> 

Agreed. Thanks for the data.

> > In thinking further about it, aren't we essentially bypassing
> > these tools for associated allocations if they don't offer similar
> > functionality for vmalloc allocations?
> 
> kasan uses guard pages around vmalloc allocations to detect out of
> bound accesses. It still tracks the page allocations, etc, so we
> still get use after free tracking, etc. i.e. AFAICT we don't
> actually lose any debugging functonality by using vmalloc here.
> 

Ok.

> > It might be worth 1.) noting that
> > as a consequence of this change in the commit log and 2.) having a
> > oneshot warning somewhere when we initially hit this problem so somebody
> > actually using one of these tools realizes that enabling it actually
> > changes allocation behavior. For example:
> > 
> > XFS ...: WARNING: Unaligned I/O memory allocation. VM debug enabled?
> > Disabling slab allocations for I/O.
> > 
> > ... or alternatively just add a WARN_ON_ONCE() or something with a
> > similar comment in the code.
> 
> Well, the WARN_ON_ONCE is in xfs_bio_add_page() when it detects an
> invalid alignment. So we'll get this warning on production systems
> as well as debug/test systems. I think that's the important case to
> catch, because misalignment will result in silent data corruption...
> 

That's in the subsequent patch that I thought was being dropped (or more
accurately, deferred until Christoph has a chance to try and get it
fixed in the block layer..)..?

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

  reply	other threads:[~2019-08-23 12:10 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-21  8:38 [PATCH 0/3] xfs: avoid IO issues unaligned memory allocation Dave Chinner
2019-08-21  8:38 ` [PATCH 1/3] xfs: add kmem allocation trace points Dave Chinner
2019-08-21 13:34   ` Brian Foster
2019-08-21 23:20   ` Christoph Hellwig
2019-08-21  8:38 ` [PATCH 2/3] xfs: add kmem_alloc_io() Dave Chinner
2019-08-21 13:35   ` Brian Foster
2019-08-21 15:08     ` Darrick J. Wong
2019-08-21 21:24       ` Dave Chinner
2019-08-21 15:23     ` Eric Sandeen
2019-08-21 21:14     ` Dave Chinner
2019-08-22 13:40       ` Brian Foster
2019-08-22 22:39         ` Dave Chinner
2019-08-23 12:10           ` Brian Foster [this message]
2019-08-21 23:24   ` Christoph Hellwig
2019-08-22  0:31     ` Dave Chinner
2019-08-22  7:59       ` Christoph Hellwig
2019-08-22  8:51         ` Peter Zijlstra
2019-08-22  9:10           ` Peter Zijlstra
2019-08-22 10:14             ` Dave Chinner
2019-08-22 11:14               ` Vlastimil Babka
2019-08-22 12:07                 ` Dave Chinner
2019-08-22 12:19                   ` Vlastimil Babka
2019-08-22 13:17                     ` Dave Chinner
2019-08-22 14:26                       ` Vlastimil Babka
2019-08-26 12:21                         ` Michal Hocko
2019-08-21  8:38 ` [PATCH 3/3] xfs: alignment check bio buffers Dave Chinner
2019-08-21 13:39   ` Brian Foster
2019-08-21 21:39     ` Dave Chinner
2019-08-22 13:47       ` Brian Foster
2019-08-22 23:03         ` Dave Chinner
2019-08-23 12:33           ` Brian Foster
2019-08-21 23:30     ` Christoph Hellwig
2019-08-22  0:44       ` Dave Chinner
2019-08-21 23:29   ` Christoph Hellwig
2019-08-22  0:37     ` Dave Chinner
2019-08-22  8:03       ` Christoph Hellwig
2019-08-22 10:17         ` Dave Chinner
2019-08-22  2:50     ` Ming Lei
2019-08-22  4:49       ` Dave Chinner
2019-08-22  7:23         ` Ming Lei
2019-08-22  8:08         ` Christoph Hellwig
2019-08-22 10:20           ` Ming Lei
2019-08-23  0:14             ` Christoph Hellwig
2019-08-23  1:19               ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190823121002.GA53137@bfoster \
    --to=bfoster@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.