* Mounting xfs filesystem takes long time
@ 2018-06-19 12:27 swadmin - levigo.de
  2018-06-19 16:00 ` Emmanuel Florac
  2018-06-19 16:18 ` Darrick J. Wong
  0 siblings, 2 replies; 14+ messages in thread
From: swadmin - levigo.de @ 2018-06-19 12:27 UTC (permalink / raw)
  To: linux-xfs

Hi @all
I have a problem with mounting a large XFS filesystem which takes about
8-10 minutes.



:~# df -h /graylog_data
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vgdata-graylog_data   11T  5.0T  5.1T  50% /graylog_data

----

:~# xfs_info /dev/mapper/vgdata-graylog_data
meta-data=/dev/mapper/vgdata-graylog_data isize=512    agcount=40805,
agsize=65792 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=2684612608, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

----

:~# strace -f mount /dev/mapper/vgdata-graylog_data /graylog_data >
/root/mount 2>&1

[...]

stat("/sbin/mount.xfs", 0x7ffe36bcfe80) = -1 ENOENT (No such file or
directory)
stat("/sbin/fs.d/mount.xfs", 0x7ffe36bcfe80) = -1 ENOENT (No such file
or directory)
stat("/sbin/fs/mount.xfs", 0x7ffe36bcfe80) = -1 ENOENT (No such file or
directory)
getuid()                                = 0
geteuid()                               = 0
getgid()                                = 0
getegid()                               = 0
prctl(PR_GET_DUMPABLE)                  = 1
stat("/run", {st_mode=S_IFDIR|0755, st_size=720, ...}) = 0
lstat("/run/mount/utab", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
open("/run/mount/utab", O_RDWR|O_CREAT|O_CLOEXEC, 0644) = 3
close(3)                                = 0
mount("/dev/mapper/vgdata-graylog_data", "/graylog_data", "xfs",
MS_MGC_VAL, NULL


[here is where the 10 minutes happen]


) = 0
--- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

[Now the mount is ok]


Can anyone help me, or does anyone know this kind of problem? What else can I try?

BR
Sascha


* Re: Mounting xfs filesystem takes long time
  2018-06-19 12:27 Mounting xfs filesystem takes long time swadmin - levigo.de
@ 2018-06-19 16:00 ` Emmanuel Florac
  2018-06-19 16:18 ` Darrick J. Wong
  1 sibling, 0 replies; 14+ messages in thread
From: Emmanuel Florac @ 2018-06-19 16:00 UTC (permalink / raw)
  To: swadmin - levigo.de; +Cc: linux-xfs


On Tue, 19 Jun 2018 14:27:29 +0200,
"swadmin - levigo.de" <swadmin@levigo.de> wrote:

> Can anyone help me, or does anyone know this kind of problem? What
> else can I try?
> 

Maybe you have very high IO pressure on the underlying devices? Try
this:

In one window, run "iostat -x 2" and look at the numbers for the device
you'll be using. In particular, look at the "% use" and "IO wait"
values. They should be at zero, or close to it.

Now, in another window, mount the volume and see what happens in the
output of iostat. Do you see any change? Are some devices receiving a
high load? Is the IO wait value up?
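
Concretely, that looks something like this (a sketch; the device and
mount point are taken from the report above, and on recent sysstat the
columns in question are named %util and await):

    # terminal 1: watch extended device statistics every 2 seconds
    iostat -x 2

    # terminal 2: time the mount while watching terminal 1
    time mount /dev/mapper/vgdata-graylog_data /graylog_data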

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------



* Re: Mounting xfs filesystem takes long time
  2018-06-19 12:27 Mounting xfs filesystem takes long time swadmin - levigo.de
  2018-06-19 16:00 ` Emmanuel Florac
@ 2018-06-19 16:18 ` Darrick J. Wong
  2018-06-19 19:21   ` Eric Sandeen
  1 sibling, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2018-06-19 16:18 UTC (permalink / raw)
  To: swadmin - levigo.de; +Cc: linux-xfs

On Tue, Jun 19, 2018 at 02:27:29PM +0200, swadmin - levigo.de wrote:
> Hi @all
> I have a problem with mounting a large XFS filesystem which takes about
> 8-10 minutes.
> 
> 
> 
> :~# df -h /graylog_data
> Filesystem                       Size  Used Avail Use% Mounted on
> /dev/mapper/vgdata-graylog_data   11T  5.0T  5.1T  50% /graylog_data
> 
> ----
> 
> :~# xfs_info /dev/mapper/vgdata-graylog_data
> meta-data=/dev/mapper/vgdata-graylog_data isize=512    agcount=40805,
> agsize=65792 blks

41,000 AGs is a lot of metadata to load.  Did someone growfs a 1G fs
into an 11T fs?
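
For reference, a sketch of the arithmetic behind that guess, using the
xfs_info values above and assuming the usual 4-AG mkfs default for a
~1G device:

    # AG size in bytes: 65792 fs blocks of 4096 bytes each, ~257 MB
    echo $((65792 * 4096))         # 269484032
    # AG count: data blocks divided by AG size gives 40804 full AGs
    # plus one partial AG, i.e. agcount=40805
    echo $((2684612608 / 65792))   # 40804, remainder 35840 blocks
    # four AGs of that size is almost exactly a 1G filesystem
    echo $((4 * 65792 * 4096))     # 1077936128, ~1.0 GiB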

--D

>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=1 spinodes=0 rmapbt=0
>          =                       reflink=0
> data     =                       bsize=4096   blocks=2684612608, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> log      =internal               bsize=4096   blocks=2560, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> ----
> 
> :~# strace -f mount /dev/mapper/vgdata-graylog_data /graylog_data >
> /root/mount 2>&1
> 
> [...]
> 
> stat("/sbin/mount.xfs", 0x7ffe36bcfe80) = -1 ENOENT (No such file or
> directory)
> stat("/sbin/fs.d/mount.xfs", 0x7ffe36bcfe80) = -1 ENOENT (No such file
> or directory)
> stat("/sbin/fs/mount.xfs", 0x7ffe36bcfe80) = -1 ENOENT (No such file or
> directory)
> getuid()                                = 0
> geteuid()                               = 0
> getgid()                                = 0
> getegid()                               = 0
> prctl(PR_GET_DUMPABLE)                  = 1
> stat("/run", {st_mode=S_IFDIR|0755, st_size=720, ...}) = 0
> lstat("/run/mount/utab", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> open("/run/mount/utab", O_RDWR|O_CREAT|O_CLOEXEC, 0644) = 3
> close(3)                                = 0
> mount("/dev/mapper/vgdata-graylog_data", "/graylog_data", "xfs",
> MS_MGC_VAL, NULL
> 
> 
> [here is where the 10 minutes happen]
> 
> 
> ) = 0
> --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
> close(1)                                = 0
> close(2)                                = 0
> exit_group(0)                           = ?
> +++ exited with 0 +++
> 
> [Now the mount is ok]
> 
> 
> Can anyone help me, or does anyone know this kind of problem? What else can I try?
> 
> BR
> Sascha


* Re: Mounting xfs filesystem takes long time
  2018-06-19 16:18 ` Darrick J. Wong
@ 2018-06-19 19:21   ` Eric Sandeen
  2018-06-21 19:15     ` Luis R. Rodriguez
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Sandeen @ 2018-06-19 19:21 UTC (permalink / raw)
  To: Darrick J. Wong, swadmin - levigo.de; +Cc: linux-xfs

On 6/19/18 11:18 AM, Darrick J. Wong wrote:
> On Tue, Jun 19, 2018 at 02:27:29PM +0200, swadmin - levigo.de wrote:
>> Hi @all
>> I have a problem with mounting a large XFS filesystem which takes about
>> 8-10 minutes.
>>
>>
>>
>> :~# df -h /graylog_data
>> Filesystem                       Size  Used Avail Use% Mounted on
>> /dev/mapper/vgdata-graylog_data   11T  5.0T  5.1T  50% /graylog_data
>>
>> ----
>>
>> :~# xfs_info /dev/mapper/vgdata-graylog_data
>> meta-data=/dev/mapper/vgdata-graylog_data isize=512    agcount=40805,
>> agsize=65792 blks
> 
> 41,000 AGs is a lot of metadata to load.  Did someone growfs a 1G fs
> into an 11T fs?

<answer: yes, they did>

Let me state that a little more clearly: this is a badly mis-administered
filesystem; 40805 x 256MB AGs is nearly unusable, as you've seen.

If at all possible I would start over with a rationally-created filesystem
and migrate the data.
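
A minimal sketch of such a migration, assuming a freshly made LV with
enough space (the vgdata-graylog_new name is made up) and using the
xfsdump/xfsrestore pipeline from the xfsdump man page:

    mkfs.xfs /dev/mapper/vgdata-graylog_new   # sane AG layout at full size
    mkdir -p /mnt/graylog_new
    mount /dev/mapper/vgdata-graylog_new /mnt/graylog_new
    # copy everything, preserving attributes, with no intermediate file
    xfsdump -J - /graylog_data | xfsrestore -J - /mnt/graylog_new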

-Eric


* Re: Mounting xfs filesystem takes long time
  2018-06-19 19:21   ` Eric Sandeen
@ 2018-06-21 19:15     ` Luis R. Rodriguez
  2018-06-21 19:19       ` Eric Sandeen
  0 siblings, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2018-06-21 19:15 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Darrick J. Wong, swadmin - levigo.de, linux-xfs

On Tue, Jun 19, 2018 at 02:21:15PM -0500, Eric Sandeen wrote:
> On 6/19/18 11:18 AM, Darrick J. Wong wrote:
> > On Tue, Jun 19, 2018 at 02:27:29PM +0200, swadmin - levigo.de wrote:
> >> Hi @all
> >> I have a problem with mounting a large XFS filesystem which takes about
> >> 8-10 minutes.
> >>
> >>
> >>
> >> :~# df -h /graylog_data
> >> Filesystem                       Size  Used Avail Use% Mounted on
> >> /dev/mapper/vgdata-graylog_data   11T  5.0T  5.1T  50% /graylog_data
> >>
> >> ----
> >>
> >> :~# xfs_info /dev/mapper/vgdata-graylog_data
> >> meta-data=/dev/mapper/vgdata-graylog_data isize=512    agcount=40805,
> >> agsize=65792 blks
> > 
> > 41,000 AGs is a lot of metadata to load.  Did someone growfs a 1G fs
> > into an 11T fs?
> 
> <answer: yes, they did>
> 
> Let me state that a little more clearly: this is a badly mis-administered
> filesystem; 40805 x 256MB AGs is nearly unusable, as you've seen.
> 
> If at all possible I would start over with a rationally-created filesystem
> and migrate the data.

Considering *a lot* of folks may fall into the above "trap", wouldn't it
be wise for userspace to complain or warn when the user is about to do
something stupid like this? Otherwise I cannot see how we could possibly
tell that this is a badly administered filesystem.

  Luis


* Re: Mounting xfs filesystem takes long time
  2018-06-21 19:15     ` Luis R. Rodriguez
@ 2018-06-21 19:19       ` Eric Sandeen
  2018-06-21 21:50         ` Chris Murphy
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Sandeen @ 2018-06-21 19:19 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: Darrick J. Wong, swadmin - levigo.de, linux-xfs



On 6/21/18 2:15 PM, Luis R. Rodriguez wrote:
> On Tue, Jun 19, 2018 at 02:21:15PM -0500, Eric Sandeen wrote:
>> On 6/19/18 11:18 AM, Darrick J. Wong wrote:
>>> On Tue, Jun 19, 2018 at 02:27:29PM +0200, swadmin - levigo.de wrote:
>>>> Hi @all
>>>> I have a problem with mounting a large XFS filesystem which takes about
>>>> 8-10 minutes.
>>>>
>>>>
>>>>
>>>> :~# df -h /graylog_data
>>>> Filesystem                       Size  Used Avail Use% Mounted on
>>>> /dev/mapper/vgdata-graylog_data   11T  5.0T  5.1T  50% /graylog_data
>>>>
>>>> ----
>>>>
>>>> :~# xfs_info /dev/mapper/vgdata-graylog_data
>>>> meta-data=/dev/mapper/vgdata-graylog_data isize=512    agcount=40805,
>>>> agsize=65792 blks
>>>
>>> 41,000 AGs is a lot of metadata to load.  Did someone growfs a 1G fs
>>> into an 11T fs?
>>
>> <answer: yes, they did>
>>
>> Let me state that a little more clearly: this is a badly mis-administered
>> filesystem; 40805 x 256MB AGs is nearly unusable, as you've seen.
>>
>> If at all possible I would start over with a rationally-created filesystem
>> and migrate the data.
> 
> Considering *a lot* of folks may fall into the above "trap", wouldn't it
> be wise for userspace to complain or warn when the user is about to do
> something stupid like this? Otherwise I cannot see how we could possibly
> tell that this is a badly administered filesystem.

Fair point, though I'm not sure where such a warning would go.  growfs?
I'm not a big fan of the "you asked for something unusual, continue [y/N]?"
type prompts.

To people who know how xfs is laid out it's "obvious" but it's not fair to
assume every admin knows this, you're right.  So calling it mis-administered
was a bit harsh.

-Eric


* Re: Mounting xfs filesystem takes long time
  2018-06-21 19:19       ` Eric Sandeen
@ 2018-06-21 21:50         ` Chris Murphy
  2018-06-21 22:19           ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Murphy @ 2018-06-21 21:50 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Luis R. Rodriguez, Darrick J. Wong, swadmin - levigo.de, xfs list

On Thu, Jun 21, 2018 at 1:19 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>
>
> On 6/21/18 2:15 PM, Luis R. Rodriguez wrote:
>> On Tue, Jun 19, 2018 at 02:21:15PM -0500, Eric Sandeen wrote:
>>> On 6/19/18 11:18 AM, Darrick J. Wong wrote:
>>>> On Tue, Jun 19, 2018 at 02:27:29PM +0200, swadmin - levigo.de wrote:
>>>>> Hi @all
>>>>> I have a problem with mounting a large XFS filesystem which takes about
>>>>> 8-10 minutes.
>>>>>
>>>>>
>>>>>
>>>>> :~# df -h /graylog_data
>>>>> Filesystem                       Size  Used Avail Use% Mounted on
>>>>> /dev/mapper/vgdata-graylog_data   11T  5.0T  5.1T  50% /graylog_data
>>>>>
>>>>> ----
>>>>>
>>>>> :~# xfs_info /dev/mapper/vgdata-graylog_data
>>>>> meta-data=/dev/mapper/vgdata-graylog_data isize=512    agcount=40805,
>>>>> agsize=65792 blks
>>>>
>>>> 41,000 AGs is a lot of metadata to load.  Did someone growfs a 1G fs
>>>> into an 11T fs?
>>>
>>> <answer: yes, they did>
>>>
>>> Let me state that a little more clearly: this is a badly mis-administered
>>> filesystem; 40805 x 256MB AGs is nearly unusable, as you've seen.
>>>
>>> If at all possible I would start over with a rationally-created filesystem
>>> and migrate the data.
>>
>> Considering *a lot* of folks may fall into the above "trap", wouldn't it
>> be wise for userspace to complain or warn when the user is about to do
>> something stupid like this? Otherwise I cannot see how we could possibly
>> tell that this is a badly administered filesystem.
>
> Fair point, though I'm not sure where such a warning would go.  growfs?
> I'm not a big fan of the "you asked for something unusual, continue [y/N]?"
> type prompts.
>
> To people who know how xfs is laid out it's "obvious" but it's not fair to
> assume every admin knows this, you're right.  So calling it mis-administered
> was a bit harsh.
>

The extreme case is interesting to me, but even more interesting are
the intermediate cases. Is it straightforward to establish a hard and
fast threshold? E.g., do not growfs more than 1000% of the original
size? Do not growfs more than X times?

Or is it a linear relationship between performance loss and each
additional growfs?


-- 
Chris Murphy


* Re: Mounting xfs filesystem takes long time
  2018-06-21 21:50         ` Chris Murphy
@ 2018-06-21 22:19           ` Dave Chinner
  2018-06-22  3:19             ` Chris Murphy
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2018-06-21 22:19 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Eric Sandeen, Luis R. Rodriguez, Darrick J. Wong,
	swadmin - levigo.de, xfs list

On Thu, Jun 21, 2018 at 03:50:11PM -0600, Chris Murphy wrote:
> On Thu, Jun 21, 2018 at 1:19 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> >
> >
> > On 6/21/18 2:15 PM, Luis R. Rodriguez wrote:
> >> On Tue, Jun 19, 2018 at 02:21:15PM -0500, Eric Sandeen wrote:
> >>> On 6/19/18 11:18 AM, Darrick J. Wong wrote:
> >>>> On Tue, Jun 19, 2018 at 02:27:29PM +0200, swadmin - levigo.de wrote:
> >>>>> Hi @all
> >>>>> I have a problem with mounting a large XFS filesystem which takes about
> >>>>> 8-10 minutes.
> >>>>>
> >>>>>
> >>>>>
> >>>>> :~# df -h /graylog_data
> >>>>> Filesystem                       Size  Used Avail Use% Mounted on
> >>>>> /dev/mapper/vgdata-graylog_data   11T  5.0T  5.1T  50% /graylog_data
> >>>>>
> >>>>> ----
> >>>>>
> >>>>> :~# xfs_info /dev/mapper/vgdata-graylog_data
> >>>>> meta-data=/dev/mapper/vgdata-graylog_data isize=512    agcount=40805,
> >>>>> agsize=65792 blks
> >>>>
> >>>> 41,000 AGs is a lot of metadata to load.  Did someone growfs a 1G fs
> >>>> into an 11T fs?
> >>>
> >>> <answer: yes, they did>
> >>>
> >>> Let me state that a little more clearly: this is a badly mis-administered
> >>> filesystem; 40805 x 256MB AGs is nearly unusable, as you've seen.
> >>>
> >>> If at all possible I would start over with a rationally-created filesystem
> >>> and migrate the data.
> >>
> >> Considering *a lot* of folks may fall into the above "trap", wouldn't it
> >> be wise for userspace to complain or warn when the user is about to do
> >> something stupid like this? Otherwise I cannot see how we could possibly
> >> tell that this is a badly administered filesystem.
> >
> > Fair point, though I'm not sure where such a warning would go.  growfs?
> > I'm not a big fan of the "you asked for something unusual, continue [y/N]?"
> > type prompts.
> >
> > To people who know how xfs is laid out it's "obvious" but it's not fair to
> > assume every admin knows this, you're right.  So calling it mis-administered
> > was a bit harsh.
> >
> 
> The extreme case is interesting to me, but even more interesting are
> the intermediate cases. Is it straightforward to establish a hard and
> fast threshold? E.g., do not growfs more than 1000% of the original
> size? Do not growfs more than X times?

Rule of thumb we've stated every time it's been asked in the past
10-15 years is "try not to grow by more than 10x the original size".

Too many allocation groups for a given storage size is bad in many
ways:

	- on spinning rust, more than 2 AGs per spindle decreases
	  general performance
	- small AGs don't hold large contiguous free spaces, leading
	  to increased file and freespace fragmentation (both almost
	  always end up being bad)
	- CPU efficiency of AG search loops (e.g. finding free
	  space) goes way down, especially as the filesystem fills
	  up

The mkfs ratios are about as optimal as we can get for the
information we have about the storage - growing by
10x (i.e. increasing the number of AGs by 10x) puts us at the
outside edge of the acceptable filesystem performance and longevity
characteristics. Growing by 100x puts us way outside the window,
and examples like this where we are talking about growing by 10000x
is just way beyond anything the static AG layout architecture was
ever intended to support....

Yes, the filesystem will still work, but unexpected delays and
non-deterministic behaviour will occur when algorithms have to
iterate all the AGs for some reason....
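
As a rough shell-arithmetic sketch of where this filesystem sits
relative to those rules of thumb (the 4-AG starting point is an
assumption based on the reported agsize):

    # growth factor relative to the assumed original 4-AG layout
    echo $((40805 / 4))        # ~10201x the original AG count
    # a fresh mkfs of ~11TB would cap AGs at 1 TiB, i.e. roughly
    # 11 AGs rather than 40805
    echo $((2684612608 * 4096 / 1099511627776))   # ~10 full TiB of data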

> Or is it a linear relationship between performance loss and each
> additional growfs?

The number of growfs operations is irrelevant - it is the
AG count:capacity ratio that matters here.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Mounting xfs filesystem takes long time
  2018-06-21 22:19           ` Dave Chinner
@ 2018-06-22  3:19             ` Chris Murphy
  2018-06-22  4:02               ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Murphy @ 2018-06-22  3:19 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Chris Murphy, Eric Sandeen, Luis R. Rodriguez, Darrick J. Wong,
	swadmin - levigo.de, xfs list

On Thu, Jun 21, 2018 at 4:19 PM, Dave Chinner <david@fromorbit.com> wrote:

> The mkfs ratios are about as optimal as we can get for the
> information we have about the storage - growing by
> 10x (i.e. increasing the number of AGs by 10x) puts us at the
> outside edge of the acceptable filesystem performance and longevity
> characteristics. Growing by 100x puts us way outside the window,
> and examples like this where we are talking about growing by 10000x
> is just way beyond anything the static AG layout architecture was
> ever intended to support....

OK that's useful information, thanks.

What about from the other direction; is it possible to make an XFS
file system too big, on an LVM thin volume?

For example a 1TB drive, and I'm scratching my head at mkfs.xfs time
and think maaaybe one day it could end up 25TB at the top end? So I
figure do mkfs.xfs on a virtual LV of 5TB now and that gives me a 5x
growfs if I really do hit 25TB one day. But for now, it's a 5TB XFS
file system on a 1TB drive. Is there any negative performance effect
if it turns out I never end up growing this file system (it lives
forever on a 1TB drive as a 5TB virtual volume and file system)?



-- 
Chris Murphy


* Re: Mounting xfs filesystem takes long time
  2018-06-22  3:19             ` Chris Murphy
@ 2018-06-22  4:02               ` Dave Chinner
  2018-06-27 23:23                 ` Luis R. Rodriguez
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2018-06-22  4:02 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Eric Sandeen, Luis R. Rodriguez, Darrick J. Wong,
	swadmin - levigo.de, xfs list

On Thu, Jun 21, 2018 at 09:19:54PM -0600, Chris Murphy wrote:
> On Thu, Jun 21, 2018 at 4:19 PM, Dave Chinner <david@fromorbit.com> wrote:
> 
> > The mkfs ratios are about as optimal as we can get for the
> > information we have about the storage - growing by
> > 10x (i.e. increasing the number of AGs by 10x) puts us at the
> > outside edge of the acceptable filesystem performance and longevity
> > characteristics. Growing by 100x puts us way outside the window,
> > and examples like this where we are talking about growing by 10000x
> > is just way beyond anything the static AG layout architecture was
> > ever intended to support....
> 
> OK that's useful information, thanks.
> 
> What about from the other direction; is it possible to make an XFS
> file system too big, on an LVM thin volume?

That's harder, but still possible. For example, to make a 40,000 AG
filesystem using the mkfs defaults, we're talking about a *40PB*
filesystem. That's going to hit limitations in dm-thinp long before
XFS becomes a problem....

> For example a 1TB drive, and I'm scratching my head at mkfs.xfs time
> and think maaaybe one day it could end up 25TB at the top end?

That's within the realm of "should work fine, but is pushing the
boundaries".

> So I
> figure do mkfs.xfs on a virtual LV of 5TB now and that gives me a 5x
> growfs if I really do hit 25TB one day. But for now, it's a 5TB XFS
> file system on a 1TB drive. Is there any negative performance effect
> if it turns out I never end up growing this file system (it lives
> forever on a 1TB drive as a 5TB virtual volume and file system)?

There's no harm to XFS in doing this - this is the basic premise of
handling thin provisioning space accounting at the filesystem level,
and it's fundamental to my subvolume work.

dm-thinp might have other ideas about how sane it is in the long
term, however.
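
For concreteness, a sketch of Chris's scenario with LVM thin
provisioning (all device, VG, and LV names here are made up):

    pvcreate /dev/sdb                              # the real 1TB drive
    vgcreate vgdata /dev/sdb
    lvcreate -L 900G --thinpool pool vgdata        # thin pool on ~1TB
    lvcreate -V 5T --thin -n scratch vgdata/pool   # 5T virtual volume
    mkfs.xfs /dev/vgdata/scratch   # AG layout sized for 5T up front

Only the blocks XFS actually writes are allocated from the pool; the
5T geometry just fixes the AG layout from day one.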

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Mounting xfs filesystem takes long time
  2018-06-22  4:02               ` Dave Chinner
@ 2018-06-27 23:23                 ` Luis R. Rodriguez
  2018-06-27 23:37                   ` Eric Sandeen
  0 siblings, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2018-06-27 23:23 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Chris Murphy, Eric Sandeen, Luis R. Rodriguez, Darrick J. Wong,
	swadmin - levigo.de, xfs list

On Fri, Jun 22, 2018 at 02:02:21PM +1000, Dave Chinner wrote:
> On Thu, Jun 21, 2018 at 09:19:54PM -0600, Chris Murphy wrote:
> > On Thu, Jun 21, 2018 at 4:19 PM, Dave Chinner <david@fromorbit.com> wrote:
> > 
> > > The mkfs ratios are about as optimal as we can get for the
> > > information we have about the storage - growing by
> > > 10x (i.e. increasing the number of AGs by 10x) puts us at the
> > > outside edge of the acceptable filesystem performance and longevity
> > > characteristics. Growing by 100x puts us way outside the window,
> > > and examples like this where we are talking about growing by 10000x
> > > is just way beyond anything the static AG layout architecture was
> > > ever intended to support....

I don't have time to test this but I can probably do so after my vacation
(which I'm on now). Would it be best to just codify this eventually instead
of leaving it as tribal knowledge?

diff --git a/growfs/xfs_growfs.c b/growfs/xfs_growfs.c
index 8ec445afb74b..14f4b6dce08f 100644
--- a/growfs/xfs_growfs.c
+++ b/growfs/xfs_growfs.c
@@ -75,6 +75,9 @@ main(int argc, char **argv)
 	fs_path_t		*fs;	/* mount point information */
 	libxfs_init_t		xi;	/* libxfs structure */
 	char			rpath[PATH_MAX];
+	int			fflag = 0;	/* -f flag */
+	long long		dsize_max_suggested; /* max suggested size */
+	long long		dsize_max_arch; /* max design size */
 
 	progname = basename(argv[0]);
 	setlocale(LC_ALL, "");
@@ -93,6 +96,9 @@ main(int argc, char **argv)
 		case 'd':
 			dflag = 1;
 			break;
+		case 'f':
+			fflag = 1;
+			break;
 		case 'e':
 			esize = atol(optarg);
 			rflag = 1;
@@ -249,6 +254,24 @@ main(int argc, char **argv)
 	if (dflag | mflag | aflag) {
 		xfs_growfs_data_t	in;
 
+		/*
+		 * Growing the filesystem by 10x increases the AG count by 10x
+		 * as well, putting us at the outside edge of the acceptable
+		 * filesystem performance and longevity characteristics.
+		 *
+		 * Growing by 100x puts us way outside the window...
+		 *
+		 * Growing by 10000x is just way beyond anything the static AG
+		 * layout architecture was ever intended to support, so unless
+		 * you use -f, we won't allow growing beyond the 10x maximum.
+		 */
+		dsize_max_suggested = ddsize * 10 / (geo.blocksize / BBSIZE);
+		if (dsize_max_suggested < ddsize)
+			dsize_max_suggested = ULLONG_MAX;
+		dsize_max_arch = ddsize * 1000 / (geo.blocksize / BBSIZE);
+		if (dsize_max_arch < ddsize)
+			dsize_max_arch = ULLONG_MAX;
+
 		if (!mflag)
 			maxpct = geo.imaxpct;
 		if (!dflag && !aflag)	/* Only mflag, no data size change */
@@ -261,6 +284,26 @@ main(int argc, char **argv)
 				(long long)dsize,
 				(long long)(ddsize/(geo.blocksize/BBSIZE)));
 			error = 1;
+		} else if (!fflag &&
+			   dsize > dsize_max_arch) {
+			fprintf(stderr, _(
+				"data size %lld is beyond what XFS recommends "
+				"for this fs; the design maximum is %lld, and "
+				"even that will hurt. Max suggested is %lld; "
+				"use -f to override.\n"),
+				(long long)dsize,
+				dsize_max_arch,
+				dsize_max_suggested);
+			error = 1;
+		} else if (!fflag &&
+			   dsize > dsize_max_suggested) {
+			fprintf(stderr, _(
+				"data size %lld is beyond what XFS recommends "
+				"for this fs; max suggested is %lld, or use "
+				"-f to override.\n"),
+				(long long)dsize,
+				dsize_max_suggested);
+			error = 1;
 		}
 
 		if (!error && dsize < geo.datablocks) {


* Re: Mounting xfs filesystem takes long time
  2018-06-27 23:23                 ` Luis R. Rodriguez
@ 2018-06-27 23:37                   ` Eric Sandeen
  2018-06-28  2:05                     ` Dave Chinner
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Sandeen @ 2018-06-27 23:37 UTC (permalink / raw)
  To: Luis R. Rodriguez, Dave Chinner
  Cc: Chris Murphy, Darrick J. Wong, swadmin - levigo.de, xfs list



On 6/27/18 6:23 PM, Luis R. Rodriguez wrote:
> On Fri, Jun 22, 2018 at 02:02:21PM +1000, Dave Chinner wrote:
>> On Thu, Jun 21, 2018 at 09:19:54PM -0600, Chris Murphy wrote:
>>> On Thu, Jun 21, 2018 at 4:19 PM, Dave Chinner <david@fromorbit.com> wrote:
>>>
>>>> The mkfs ratios are about as optimal as we can get for the
>>>> information we have about the storage - growing by
>>>> 10x (i.e. increasing the number of AGs by 10x) puts us at the
>>>> outside edge of the acceptable filesystem performance and longevity
>>>> characteristics. Growing by 100x puts us way outside the window,
>>>> and examples like this where we are talking about growing by 10000x
>>>> is just way beyond anything the static AG layout architecture was
>>>> ever intended to support....
> 
> I don't have time to test this but I can probably do so after my vacation
> (which I'm on now). Would it be best to just codify this eventually instead
> of leaving it as tribal knowledge?

Honestly, if we wanted something like this I think it'd be based on terminal
AG count, not growfs multiplier for a specific instance.

Otherwise 100 consecutive 4x growfs's would yield the same problems without
tripping any of these tests.
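
A sketch of that check in shell terms (the 1000-AG threshold is purely
illustrative, not an official limit):

    #!/bin/sh
    # Warn on the terminal AG count a grow would produce, regardless of
    # how many growfs invocations it takes to get there.
    # usage: agcheck <mountpoint> <new data size in fs blocks>
    mnt=$1; new_blocks=$2
    agsize=$(xfs_info "$mnt" | sed -n 's/.*agsize=\([0-9]*\) blks.*/\1/p')
    terminal=$(( (new_blocks + agsize - 1) / agsize ))
    if [ "$terminal" -gt 1000 ]; then
        echo "warning: growing to $new_blocks blocks yields $terminal AGs" >&2
    fi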

-Eric

> diff --git a/growfs/xfs_growfs.c b/growfs/xfs_growfs.c
> index 8ec445afb74b..14f4b6dce08f 100644
> --- a/growfs/xfs_growfs.c
> +++ b/growfs/xfs_growfs.c
> @@ -75,6 +75,9 @@ main(int argc, char **argv)
>  	fs_path_t		*fs;	/* mount point information */
>  	libxfs_init_t		xi;	/* libxfs structure */
>  	char			rpath[PATH_MAX];
> +	int			fflag = 0;	/* -f flag */
> +	long long		dsize_max_suggested; /* max suggested size */
> +	long long		dsize_max_arch; /* max design size */
>  
>  	progname = basename(argv[0]);
>  	setlocale(LC_ALL, "");
> @@ -93,6 +96,9 @@ main(int argc, char **argv)
>  		case 'd':
>  			dflag = 1;
>  			break;
> +		case 'f':
> +			fflag = 1;
> +			break;
>  		case 'e':
>  			esize = atol(optarg);
>  			rflag = 1;
> @@ -249,6 +254,24 @@ main(int argc, char **argv)
>  	if (dflag | mflag | aflag) {
>  		xfs_growfs_data_t	in;
>  
> +		/*
> +		 * Growing the filesystem by 10x increases the AG count by 10x
> +		 * as well, putting us at the outside edge of the acceptable
> +		 * filesystem performance and longevity characteristics.
> +		 *
> +		 * Growing by 100x puts us way outside the window...
> +		 *
> +		 * Growing by 10000x is just way beyond anything the static AG
> +		 * layout architecture was ever intended to support, so unless
> +		 * you use -f, we won't allow growing beyond the 10x maximum.
> +		 */
> +		dsize_max_suggested = ddsize * 10 / (geo.blocksize / BBSIZE);
> +		if (dsize_max_suggested < ddsize)
> +			dsize_max_suggested = ULLONG_MAX;
> +		dsize_max_arch = ddsize * 1000 / (geo.blocksize / BBSIZE);
> +		if (dsize_max_arch < ddsize)
> +			dsize_max_arch = ULLONG_MAX;
> +
>  		if (!mflag)
>  			maxpct = geo.imaxpct;
>  		if (!dflag && !aflag)	/* Only mflag, no data size change */
> @@ -261,6 +284,26 @@ main(int argc, char **argv)
>  				(long long)dsize,
>  				(long long)(ddsize/(geo.blocksize/BBSIZE)));
>  			error = 1;
> +		} else if (!fflag &&
> +			   dsize > dsize_max_arch) {
> +			fprintf(stderr, _(
> +				"data size %lld is beyond what XFS recommends "
> +				"for this fs; the design maximum is %lld, and "
> +				"even that will hurt. Max suggested is %lld; "
> +				"use -f to override.\n"),
> +				(long long)dsize,
> +				dsize_max_arch,
> +				dsize_max_suggested);
> +			error = 1;
> +		} else if (!fflag &&
> +			   dsize > dsize_max_suggested) {
> +			fprintf(stderr, _(
> +				"data size %lld is beyond what XFS recommends "
> +				"for this fs; max suggested is %lld, or use "
> +				"-f to override.\n"),
> +				(long long)dsize,
> +				dsize_max_suggested);
> +			error = 1;
>  		}
>  
>  		if (!error && dsize < geo.datablocks) {
> 


* Re: Mounting xfs filesystem takes long time
  2018-06-27 23:37                   ` Eric Sandeen
@ 2018-06-28  2:05                     ` Dave Chinner
  2018-06-28  8:19                       ` Carlos Maiolino
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2018-06-28  2:05 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Luis R. Rodriguez, Chris Murphy, Darrick J. Wong,
	swadmin - levigo.de, xfs list

On Wed, Jun 27, 2018 at 06:37:31PM -0500, Eric Sandeen wrote:
> 
> 
> On 6/27/18 6:23 PM, Luis R. Rodriguez wrote:
> > On Fri, Jun 22, 2018 at 02:02:21PM +1000, Dave Chinner wrote:
> >> On Thu, Jun 21, 2018 at 09:19:54PM -0600, Chris Murphy wrote:
> >>> On Thu, Jun 21, 2018 at 4:19 PM, Dave Chinner <david@fromorbit.com> wrote:
> >>>
> >>>> The mkfs ratios are about as optimal as we can get for the
> >>>> information we have about the storage - growing by
> >>>> 10x (i.e. increasing the number of AGs by 10x) puts us at the
> >>>> outside edge of the acceptable filesystem performance and longevity
> >>>> characteristics. Growing by 100x puts us way outside the window,
> >>>> and examples like this where we are talking about growing by 10000x
> >>>> is just way beyond anything the static AG layout architecture was
> >>>> ever intended to support....
> > 
> > I don't have time to test this but I can probably do so after my vacation
> > (which I'm on now). Would it be best to just codify this eventually instead
> > of leaving it as tribal knowledge?
> 
> Honestly, if we wanted something like this I think it'd be based on terminal
> AG count, not growfs multiplier for a specific instance.

IMO, this belongs in the admin documentation (e.g. the growfs man
page), not the code. The people writing apps and automated
deployment scripts that grow filesystems need to know about this,
not the end users who simply use these pre-canned
apps/environments...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Mounting xfs filesystem takes long time
  2018-06-28  2:05                     ` Dave Chinner
@ 2018-06-28  8:19                       ` Carlos Maiolino
  0 siblings, 0 replies; 14+ messages in thread
From: Carlos Maiolino @ 2018-06-28  8:19 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eric Sandeen, Luis R. Rodriguez, Darrick J. Wong, xfs list

On Thu, Jun 28, 2018 at 12:05:04PM +1000, Dave Chinner wrote:
> On Wed, Jun 27, 2018 at 06:37:31PM -0500, Eric Sandeen wrote:
> > 
> > 
> > On 6/27/18 6:23 PM, Luis R. Rodriguez wrote:
> > > On Fri, Jun 22, 2018 at 02:02:21PM +1000, Dave Chinner wrote:
> > >> On Thu, Jun 21, 2018 at 09:19:54PM -0600, Chris Murphy wrote:
> > >>> On Thu, Jun 21, 2018 at 4:19 PM, Dave Chinner <david@fromorbit.com> wrote:
> > >>>
> > >>>> The mkfs ratios are about as optimal as we can get for the
> > >>>> information we have about the storage - growing by
> > >>>> 10x (i.e. increasing the number of AGs by 10x) puts us at the
> > >>>> outside edge of the acceptable filesystem performance and longevity
> > >>>> characteristics. Growing by 100x puts us way outside the window,
> > >>>> and examples like this where we are talking about growing by 10000x
> > >>>> is just way beyond anything the static AG layout architecture was
> > >>>> ever intended to support....
> > > 
> > > I don't have time to test this but I can probably do so after my vacation
> > > (which I'm on now). Would it be best to just codify this eventually instead
> > > of leaving it as tribal knowledge?
> > 
> > Honestly, if we wanted something like this I think it'd be based on terminal
> > AG count, not growfs multiplier for a specific instance.
> 

Agreed. The biggest issue here is the number of AGs, not the filesystem
size directly. For this to work in a reliable way, we'd need to store
the 'original' AG count from when the FS was created; otherwise, as
Eric stated, nothing prevents somebody from running growfs several
times instead of growing the FS in a single step.

Although, this makes me wonder whether we couldn't somehow make it a
bit more flexible in the future, but I think that's where Dave's
subvolume work comes in?


> IMO, this belongs in the admin documentation (e.g. the growfs man
> page), not the code. The people writing apps and automated
> deployment scripts that grow filesystems need to know about this,
> not the end users who simply use these pre-canned
> apps/environments...
> 

+1, most people who look into the code already know this; users of growfs
don't, so this belongs in the man page IMHO.

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

-- 
Carlos

