From: Dave Chinner <david@fromorbit.com>
To: Ric Wheeler <ricwheeler@gmail.com>
Cc: Eric Sandeen <sandeen@sandeen.net>,
	Ilya Dryomov <idryomov@gmail.com>,
	xfs <linux-xfs@vger.kernel.org>, Mark Nelson <mnelson@redhat.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>
Subject: Re: [PATCH] mkfs.xfs: don't go into multidisk mode if there is only one stripe
Date: Fri, 30 Nov 2018 13:25:10 +1100
Message-ID: <20181130022510.GW6311@dastard>
In-Reply-To: <39031e68-3936-b5e1-bcb6-6fdecc5988c1@gmail.com>

On Thu, Nov 29, 2018 at 06:53:14PM -0500, Ric Wheeler wrote:
> On 11/29/18 4:48 PM, Dave Chinner wrote:
> >On Thu, Nov 29, 2018 at 08:53:39AM -0500, Ric Wheeler wrote:
> >>On 10/6/18 8:14 PM, Eric Sandeen wrote:
> >>>On 10/6/18 6:20 PM, Dave Chinner wrote:
> >>>>>Can you give an example of a use case that would be negatively affected
> >>>>>if this heuristic was switched from "sunit" to "sunit < swidth"?
> >>>>Any time you only know a single alignment characteristic of the
> >>>>underlying multi-disk storage. e.g. hardware RAID0/5/6 that sets
> >>>>iomin = ioopt, multi-level RAID constructs where only the largest
> >>>>alignment requirement is exposed, RAID1 devices exposing their chunk
> >>>>size, remote replication chunk alignment (because remote rep. is
> >>>>slow and so we need more concurrency to keep the pipeline full),
> >>>>etc.
> >>>So the tl;dr here is "given any iomin > 512, we should infer low seek
> >>>latency and parallelism and adjust geometry accordingly?"
> >>>
> >>>-Eric
> >>Chiming in late here, but I do think that every decade or two (no
> >>disrespect to xfs!), it is worth having a second look at how the
> >>storage has changed under us.
> >>
> >>The workload that has lots of file systems pounding on a shared
> >>device for example is one way to lay out container storage.
> >The problem is that defaults can't cater for every use case.
> >And in this case, we've got nothing to tell us that this is
> >aggregated/shared storage rather than "the filesystem owns the
> >entire device".
> >
> >>No argument about documenting how to fix this with command line
> >>tweaks for now, but maybe this would be a good topic for the next
> >>LSF/MM shared track of file & storage people to debate?
> >Doubt it - this is really only an XFS problem at this point.
> >
> >i.e. if we can't infer what the user wants from existing
> >information, then I don't see how the storage is going to be able to
> >tell us anything different, either.  i.e. somewhere in the stack the
> >user is going to have to tell the block device that this is
> >aggregated storage.
> >
> >But even then, if it's aggregated solid state storage, we still want
> >to make use of the concurrency of an increased AG count because there
> >is no seek penalty like spinning drives end up with. Or if the
> >aggregated storage is thinly provisioned, the AG count of the
> >filesystem just doesn't matter because the IO is going to be massively
> >randomised (i.e. take random seek penalties) by the thinp layout.
> >
> >So there's really no good way of "guessing" whether aggregated
> >storage should or shouldn't use elevated AG counts even if the
> >storage says "this is aggregated storage". The user still has to
> >give us some kind of explicit hint about how the filesystem should
> >be configured.
> >
> >What we need is for a solid, reliable detection heuristic to be
> >suggested by the people that need this functionality before there's
> >anything we can talk about.
> 
> I think that is exactly the kind of discussion that the shared
> file/storage track is good for.

Yes, but why on earth do we need to wait 6 months to have that
conversation? Start it now...

> Other file systems also need to
> accommodate/probe behind the fictitious visible storage device
> layer... Specifically, is there something we can add per block
> device to help here? Number of independent devices

That's how mkfs.xfs used to do stripe unit/stripe width calculations
automatically on MD devices back in the 2000s. We got rid of that
for more generally applicable configuration information such as
minimum/optimal IO sizes so we could expose equivalent alignment
information from lots of different types of storage device....
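
To make the two heuristics under discussion concrete, here is a minimal
sketch, not the mkfs.xfs implementation. It assumes the block layer's
hints are read from the sysfs files queue/minimum_io_size and
queue/optimal_io_size, and that iomin maps directly to sunit and ioopt
to swidth (a simplification). The function names are hypothetical.

```python
from pathlib import Path

SECTOR = 512  # bytes; assumed sector size for this sketch


def read_io_hints(dev, sysfs_root="/sys/block"):
    """Return (iomin, ioopt) in bytes for a block device such as 'sda'."""
    queue = Path(sysfs_root) / dev / "queue"
    iomin = int((queue / "minimum_io_size").read_text())
    ioopt = int((queue / "optimal_io_size").read_text())
    return iomin, ioopt


def to_stripe_geometry(iomin, ioopt, sector=SECTOR):
    """Map the hints to (sunit, swidth) in bytes (assumed mapping).

    An iomin no larger than a sector carries no alignment information,
    so it is reported as (0, 0).
    """
    if iomin <= sector:
        return 0, 0
    return iomin, ioopt


def multidisk_current(sunit, swidth):
    """Heuristic as described in this thread: any stripe unit implies multidisk."""
    return sunit > 0


def multidisk_proposed(sunit, swidth):
    """Proposed alternative: multidisk only when sunit < swidth."""
    return 0 < sunit < swidth


# A device that sets iomin = ioopt (e.g. 64k / 64k):
sunit, swidth = to_stripe_geometry(65536, 65536)
print(multidisk_current(sunit, swidth), multidisk_proposed(sunit, swidth))
```

For a device that reports iomin = ioopt, the two functions disagree,
which is the case the thread is arguing about: the single value may
describe one stripe, or it may be the only alignment characteristic that
a multi-disk device exposes.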

> or a map of
> those regions?

Not sure what this means or how we'd use it.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
