* xfsrestore performance
@ 2016-05-27 22:39 xfs.pkoch
  2016-05-28  9:25 ` xfs.pkoch
  2016-05-29 23:20 ` Dave Chinner
  0 siblings, 2 replies; 6+ messages in thread
From: xfs.pkoch @ 2016-05-27 22:39 UTC (permalink / raw)
  To: xfs


Dear XFS experts,

I was using a 16TB Linux mdraid RAID10 volume built from 16 Seagate
2TB disks, which was formatted with an ext3 filesystem. It contained
a couple of hundred very large files (ZFS full and incremental dumps
with sizes between 10GB and 400GB). It also contained 7 million
files from our users' home directories, which were backed up with
rsync --link-dest=<last backup dir>, so most of these files are
just hard links to previous versions.
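
For readers not familiar with this scheme, a minimal sketch of such a
hard-link backup run might look like this (the directory names are
hypothetical, not taken from this setup):

  # yesterday's snapshot serves as the hard-link reference
  rsync -a --delete \
      --link-dest=/backup/home/2016-05-26 \
      /home/ /backup/home/2016-05-27/

Files that did not change are created as hard links into the previous
snapshot, so a new snapshot only costs space for the changed files
plus the directory entries.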

Two weeks ago I grew the volume from 16TB to 20TB, which can be done
with Linux mdraid while the filesystem stays in use. The reshape took
two days. Then I unmounted the ext3 filesystem to grow it from 16TB
to 20TB. And guess what - ext3 does not support filesystems larger
than 16TB.

I decided to change the filesystem to XFS, but the system must be
online during weekdays, so I only have a 48-hour window for lengthy
copies.

I added a 16TB temporary RAID5 volume to the machine and here's
what I did so far:

1: create a 14TB XFS filesystem on the temporary RAID5 volume
2: first rsync run to copy the ext3 fs to the temporary XFS fs;
this took 6 days
3: another rsync run to copy what changed during the first run;
this took another 2 days
4: another rsync run to copy what changed during the second run;
this took another day
5: xfsdump the temporary XFS fs to /dev/null; this took 20 hours
6: remount the ext3 fs read-only and do a final rsync run to
copy what changed during the third run; this took 10 hours
7: delete the ext3 fs and create a 20TB XFS fs
8: copy the temporary XFS fs back to the new XFS fs using
xfsdump | xfsrestore

Here's my problem: since dumping the temporary XFS fs to /dev/null
took less than a day, I expected the xfsdump | xfsrestore combination
to finish in less than 2 days. xfsdump | xfsrestore should be a lot
faster than rsync, since it just pumps blocks from one XFS fs into
another.

But either xfsrestore is painfully slow or I did something wrong:

Please have a look:

root@backup:/var/tmp# xfsdump -J -p600 - /xtmp | xfsrestore -J -a /var/tmp - /xtmp2
xfsrestore: using file dump (drive_simple) strategy
xfsdump: using file dump (drive_simple) strategy
xfsrestore: version 3.1.3 (dump format 3.0)
xfsdump: version 3.1.3 (dump format 3.0)
xfsrestore: searching media for dump
xfsdump: level 0 dump of backup:/xtmp
xfsdump: dump date: Fri May 27 13:15:42 2016
xfsdump: session id: adb95c2e-332b-4dde-9c8b-e03760d5a83b
xfsdump: session label: ""
xfsdump: ino map phase 1: constructing initial dump list
xfsdump: status at 13:25:42: inomap phase 1 14008321/28643415 inos scanned, 600 seconds elapsed
xfsdump: ino map phase 2: skipping (no pruning necessary)
xfsdump: ino map phase 3: skipping (only one dump stream)
xfsdump: ino map construction complete
xfsdump: estimated dump size: 12831156312640 bytes
xfsdump: creating dump session media file 0 (media 0, file 0)
xfsdump: dumping ino map
xfsrestore: examining media file 0
xfsrestore: dump description:
xfsrestore: hostname: backup
xfsrestore: mount point: /xtmp
xfsrestore: volume: /dev/md6
xfsrestore: session time: Fri May 27 13:15:42 2016
xfsrestore: level: 0
xfsrestore: session label: ""
xfsrestore: media label: ""
xfsrestore: file system id: 29825afd-5d7e-485f-9eb1-8871a21ce71d
xfsrestore: session id: adb95c2e-332b-4dde-9c8b-e03760d5a83b
xfsrestore: media id: 5ef22542-774a-4504-a823-d007d2ce4720
xfsrestore: searching media for directory dump
xfsrestore: NOTE: attempt to reserve 1162387864 bytes for /var/tmp/xfsrestorehousekeepingdir/dirattr using XFS_IOC_RESVSP64 failed: Operation not supported (95)
xfsrestore: NOTE: attempt to reserve 286438226 bytes for /var/tmp/xfsrestorehousekeepingdir/namreg using XFS_IOC_RESVSP64 failed: Operation not supported (95)
xfsrestore: reading directories
xfsdump: dumping directories
xfsdump: dumping non-directory files
xfsdump: status at 20:04:52: 1/7886560 files dumped, 0.0% data dumped, 24550 seconds elapsed
xfsrestore: 20756853 directories and 274128228 entries processed
xfsrestore: directory post-processing
xfsrestore: restoring non-directory files
xfsdump: status at 21:27:27: 26/7886560 files dumped, 0.0% data dumped, 29505 seconds elapsed
xfsdump: status at 21:35:46: 20930/7886560 files dumped, 0.0% data dumped, 30004 seconds elapsed
xfsdump: status at 21:46:26: 46979/7886560 files dumped, 0.1% data dumped, 30644 seconds elapsed
xfsdump: status at 21:55:52: 51521/7886560 files dumped, 0.1% data dumped, 31210 seconds elapsed
xfsdump: status at 22:05:45: 57770/7886560 files dumped, 0.1% data dumped, 31803 seconds elapsed
xfsdump: status at 22:15:43: 63142/7886560 files dumped, 0.1% data dumped, 32401 seconds elapsed
xfsdump: status at 22:25:42: 73621/7886560 files dumped, 0.1% data dumped, 33000 seconds elapsed
xfsdump: status at 22:35:51: 91223/7886560 files dumped, 0.1% data dumped, 33609 seconds elapsed
xfsdump: status at 22:45:42: 94096/7886560 files dumped, 0.2% data dumped, 34200 seconds elapsed
xfsdump: status at 22:55:42: 96702/7886560 files dumped, 0.2% data dumped, 34800 seconds elapsed
xfsdump: status at 23:05:42: 102808/7886560 files dumped, 0.2% data dumped, 35400 seconds elapsed
xfsdump: status at 23:16:15: 107096/7886560 files dumped, 0.2% data dumped, 36033 seconds elapsed
xfsdump: status at 23:25:47: 109079/7886560 files dumped, 0.2% data dumped, 36605 seconds elapsed
xfsdump: status at 23:35:52: 112318/7886560 files dumped, 0.2% data dumped, 37210 seconds elapsed
xfsdump: status at 23:45:46: 114975/7886560 files dumped, 0.2% data dumped, 37804 seconds elapsed
xfsdump: status at 23:55:55: 117260/7886560 files dumped, 0.2% data dumped, 38413 seconds elapsed
xfsdump: status at 00:05:44: 118722/7886560 files dumped, 0.2% data dumped, 39002 seconds elapsed

Seems like 2 days was a little optimistic

Any ideas what's going wrong here?

Peter Koch


* Re: xfsrestore performance
  2016-05-27 22:39 xfsrestore performance xfs.pkoch
@ 2016-05-28  9:25 ` xfs.pkoch
  2016-05-28 13:23   ` Brian Foster
  2016-05-30 12:52   ` Emmanuel Florac
  2016-05-29 23:20 ` Dave Chinner
  1 sibling, 2 replies; 6+ messages in thread
From: xfs.pkoch @ 2016-05-28  9:25 UTC (permalink / raw)
  To: xfs


Dear XFS experts,

I checked the situation this morning:

....
xfsdump: status at 10:25:53: 473543/7886560 files dumped, 1.3% data dumped, 76211 seconds elapsed
xfsdump: status at 10:35:46: 478508/7886560 files dumped, 1.3% data dumped, 76804 seconds elapsed

and decided to stop the process. At this speed, xfsdump | xfsrestore
will need more than a month. Something must be seriously wrong here.

I see two options: Use rsync again to copy the data from the 14TB xfs
filesystem to the new 20TB xfs filesystem. This won't be finished
this weekend.

Or use dd to copy the 14TB XFS filesystem into the 20TB volume and
then grow the filesystem.

dd runs at 300MB/sec, which is approx 1TB per hour, so I decided to go
this way.
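
A minimal sketch of that plan, assuming /dev/md6 is the temporary
volume (as shown in the dump output above) and /dev/md5 the new 20TB
volume with /backup as its mount point (the latter two names are
assumptions):

  # block-level copy of the smaller filesystem onto the larger volume
  dd if=/dev/md6 of=/dev/md5 bs=64M
  # then mount it and grow the filesystem to fill the new volume
  mount /dev/md5 /backup
  xfs_growfs /backup

The source filesystem has to be unmounted (or at least mounted
read-only) while dd runs, otherwise the copy will be inconsistent.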

So here's another question: the new filesystem will run on a 20-disk
RAID10 volume but will have been copied from a 16-disk RAID5 volume,
so swidth will be wrong. Also, all the data will be within the first
15TB.

What should I do to fix this? Or will xfs_growfs fix it automatically?

Regards

Peter Koch


* Re: xfsrestore performance
  2016-05-28  9:25 ` xfs.pkoch
@ 2016-05-28 13:23   ` Brian Foster
  2016-05-30 12:52   ` Emmanuel Florac
  1 sibling, 0 replies; 6+ messages in thread
From: Brian Foster @ 2016-05-28 13:23 UTC (permalink / raw)
  To: xfs.pkoch; +Cc: xfs

On Sat, May 28, 2016 at 11:25:32AM +0200, xfs.pkoch@dfgh.net wrote:
> Dear XFS experts,
> 
> I checked the situation this morning:
> 
> ....
> xfsdump: status at 10:25:53: 473543/7886560 files dumped, 1.3% data dumped, 76211 seconds elapsed
> xfsdump: status at 10:35:46: 478508/7886560 files dumped, 1.3% data dumped, 76804 seconds elapsed
> 
> and decided to stop the process. At this speed, xfsdump | xfsrestore
> will need more than a month. Something must be seriously wrong here.
> 

I don't have much experience working with xfsdump so I couldn't really
comment here without spending some significant time playing around with
it. Hopefully somebody else can chime in on this.

> I see two options: Use rsync again to copy the data from the 14TB xfs
> filesystem to the new 20TB xfs filesystem. This won't be finished
> this weekend.
> 

Could you temporarily promote your 14TB fs to production in order to
make the data available while an rsync process migrates the data back
over to the properly formatted 20TB fs?

> Or use dd to copy the 14TB XFS filesystem into the 20TB volume and
> then grow the filesystem.
> 
> dd runs at 300MB/sec, which is approx 1TB per hour, so I decided to go
> this way.
> 
> So here's another question: the new filesystem will run on a 20-disk
> RAID10 volume but will have been copied from a 16-disk RAID5 volume, so
> swidth will be wrong. Also, all the data will be within the first 15TB.
> 
> What should I do to fix this? Or will xfs_growfs fix it automatically?
> 

xfs_growfs will add more allocation groups based on the size of the
current storage. It won't reallocate existing files or anything of that
nature.

You can't change the stripe unit/width parameters after the fact, afaik.
The only way to accomplish this appropriately that I can think of is to
forcibly reformat the temporary filesystem with the settings targeted to
the longer term storage, reimport the data from the production volume
and then copy over the raw block device.
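
As an illustration only, "settings targeted to the longer term
storage" would be given at mkfs time. For a 20-disk RAID10 with a
512K chunk (10 data-bearing stripe units), reformatting the temporary
volume from the dump output above could look roughly like:

  # the geometry values must match the real md array; these are
  # examples, not a recommendation for this particular setup
  mkfs.xfs -f -d su=512k,sw=10 /dev/md6

Running xfs_info on the mounted filesystem afterwards shows the
sunit/swidth it was created with.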

Brian

> Regards
> 
> Peter Koch


* Re: xfsrestore performance
  2016-05-27 22:39 xfsrestore performance xfs.pkoch
  2016-05-28  9:25 ` xfs.pkoch
@ 2016-05-29 23:20 ` Dave Chinner
  1 sibling, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2016-05-29 23:20 UTC (permalink / raw)
  To: xfs.pkoch; +Cc: xfs

On Sat, May 28, 2016 at 12:39:45AM +0200, xfs.pkoch@dfgh.net wrote:
> Dear XFS experts,
> 
> I was using a 16TB Linux mdraid RAID10 volume built from 16 Seagate
> 2TB disks, which was formatted with an ext3 filesystem. It contained
> a couple of hundred very large files (ZFS full and incremental dumps
> with sizes between 10GB and 400GB). It also contained 7 million
> files from our users' home directories, which were backed up with
> rsync --link-dest=<last backup dir>, so most of these files are
> just hard links to previous versions.

Oh, dear. There's a massive red flag. I'll come back to it...

> 1: create a 14TB XFS-filesystem on the temporary RAID5-volume
> 2: first rsync run to copy the ext3 fs to the temporary XFS-fs,
> this took 6 days

Rsync took 6 days to copy the filesystem - I doubt dump/restore is
going to be 3x faster than that. xfsdump is much faster than rsync
on the read side, but xfsrestore runs at about the same speed as
rsync on the write side. As such, 2x faster is about as much as you
can expect for a "data-mostly" dump/restore. I'll come back to this....

> 3: another rsync run to copy what changed during the first run,
> this took another 2 days
> 4: another rsync run to copy what changed during the second run,
> this took another day
> 5: xfsdump the temporary xfs fs to /dev/null. took 20 hours

Nothing to slow down xfsdump reading from disk. Benchmarks lie.

> 6: remount the ext3 fs read-only and do a final rsync run to
> copy what changed during the third run; this took 10 hours
> 7: delete the ext3 fs and create a 20TB XFS fs
> 8: copy the temporary XFS fs back to the new XFS fs using
> xfsdump | xfsrestore
> 
> Here's my problem: since dumping the temporary XFS fs to /dev/null
> took less than a day, I expected the xfsdump | xfsrestore combination
> to finish in less than 2 days. xfsdump | xfsrestore should be a lot
> faster than rsync, since it just pumps blocks from one XFS fs into
> another.

dump is fast - restore is the slow point because it has to recreate
everything. That's what limits the speed of dump - the pipe has a
bound limit on data in flight, so dump is throttled to restore
speed when you run this.

And, as I said I'll come back to, restore is slow because:

[....]
> xfsrestore: reading directories
> xfsdump: dumping directories
> xfsdump: dumping non-directory files
> xfsdump: status at 20:04:52: 1/7886560 files dumped, 0.0% data dumped, 24550 seconds elapsed
> xfsrestore: 20756853 directories and 274128228 entries processed
> xfsrestore: directory post-processing
> xfsrestore: restoring non-directory files

The filesystem is not exactly as you described.  Did you notice that
xfsrestore realises that it has to restore 20 million directories
and *274 million* directory entries? i.e. for those 7 million inodes
containing data, there are roughly 40 hard links pointing to each
inode. There are also 3 directory inodes for every regular file.
This is not a "data mostly" filesystem - it has vastly more metadata
than it has data, even though the data takes up more space.

Keep in mind that it took dump the best part of 7 hours just to read
all the inodes and the directory structure to build the dump
inventory. This matches the final ext3 rsync pass of 10 hours,
which should have copied very little data.  Creating 270 million
hard links in 20 million directories from scratch takes a long time,
and xfsrestore will be no faster at that than rsync....

> Seems like 2 days was a little optimistic

Just a little. :/

Personally, I would have copied the data using rsync to the
temporary XFS filesystem of the same size and shape of the final
destination (via mkfs parameters to ensure stripe unit/width match
final destination) and then used xfs_copy to do a block level copy
of the temporary filesystem back to the final destination. xfs_copy
will run *much* faster than xfsdump/restore....
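
A hedged sketch of that sequence (the device names and mount point
are assumptions; /dev/md6 is the temporary volume from the dump
output above):

  # the temporary fs was formatted with the final su/sw via mkfs.xfs
  # the source must be unmounted (or mounted read-only) during the copy
  xfs_copy /dev/md6 /dev/md5
  # then grow the filesystem to use the extra space on the new volume
  mount /dev/md5 /backup
  xfs_growfs /backup

xfs_copy only copies used blocks (and can write several targets in
parallel), which is why it is normally much faster than dd'ing the
whole device.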

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com


* Re: xfsrestore performance
  2016-05-28  9:25 ` xfs.pkoch
  2016-05-28 13:23   ` Brian Foster
@ 2016-05-30 12:52   ` Emmanuel Florac
  1 sibling, 0 replies; 6+ messages in thread
From: Emmanuel Florac @ 2016-05-30 12:52 UTC (permalink / raw)
  To: xfs.pkoch; +Cc: xfs

On Sat, 28 May 2016 11:25:32 +0200,
xfs.pkoch@dfgh.net wrote:

> I see two options: Use rsync again to copy the data from the 14TB xfs
> filesystem to the new 20TB xfs filesystem. This won't be finished
> this weekend.

Just a point; in my experience good old "cp -a" is generally much faster
than rsync for the first copy.

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------


* Re: xfsrestore performance
@ 2016-05-31  9:39 xfs.pkoch
  0 siblings, 0 replies; 6+ messages in thread
From: xfs.pkoch @ 2016-05-31  9:39 UTC (permalink / raw)
  To: xfs


Dear Dave:

Thanks very much for your explanations

2016-05-30 1:20 GMT+02:00 Dave Chinner - david@fromorbit.com <
xfs.pkoch.2540fe3cfd.david#fromorbit.com@ob.0sg.net>:

> ....
> Oh, dear. There's a massive red flag. I'll come back to it...
>



> > 5: xfsdump the temporary xfs fs to /dev/null. took 20 hours
>
> Nothing to slow down xfsdump reading from disk. Benchmarks lie.
>

> dump is fast - restore is the slow point because it has to recreate
> everything. That's what limits the speed of dump - the pipe has a
> bound limit on data in flight, so dump is throttled to restore
> speed when you run this.
>
> And, as I said I'll come back to, restore is slow because:
>

> The filesystem is not exactly as you described.  Did you notice that
> xfsrestore realises that it has to restore 20 million directories
> and *274 million* directory entries? i.e. for those 7 million inodes
> containing data, there are roughly 40 hard links pointing to each
> inode. There are also 3 directory inodes for every regular file.
> This is not a "data mostly" filesystem - it has vastly more metadata
> than it has data, even though the data takes up more space.
>

Our backup server has 46 versions of our home directories and 158
versions of our mailserver, so if a file has not changed for more
than a year it exists only once on the backup server, together with
45 or 157 hard links respectively.
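
A quick way to confirm such link counts on the backup filesystem
would be something like this (the path is hypothetical):

  # print link count and name of regular files with many hard links
  find /backup/home -type f -links +40 -printf '%n %p\n' | head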

I'm astonished myself - firstly about the numbers, and also about
the fact that our backup strategy works quite well.

Also rsync does a very good job. It was able to copy all these hard links
in 6 days from a 16TB ext3 filesystem on a RAID10-volume to a
15TB xfs filesystem on a RAID5-volume.

And right now 4 rsync processes are copying the 15TB xfs filesystem
back to a 20TB xfs-filesystem. And it seems as if this will finish
today (after 3 days only). Very nice.
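
For reference, one common way to run several rsyncs in parallel is
one process per top-level directory, roughly like this (the directory
names and the target mount point are made up):

  # -H preserves hard links, but only among files copied by the same
  # rsync process, so the split must keep hard-linked trees together
  for d in home mail zfsdumps; do
      rsync -aH --delete /xtmp/$d/ /backup/$d/ &
  done
  wait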

> Keep in mind that it took dump the best part of 7 hours just to read
> all the inodes and the directory structure to build the dump
> inventory. This matches the final ext3 rsync pass of 10 hours,
> which should have copied very little data.  Creating 270 million
> hard links in 20 million directories from scratch takes a long time,
> and xfsrestore will be no faster at that than rsync....
>

That was my misunderstanding. I believed/hoped that a tool built
for a specific filesystem would outperform a generic tool like
rsync. I thought xfsdump would write all used filesystem blocks into
a data stream and xfsrestore would just read the blocks from stdin
and write them back to the destination filesystem - much like a dd
process that knows about the device content and can skip unused
blocks.


> > Seems like 2 days was a little optimistic
>
> Just a little. :/
>

It would have taken approx 1000 hours.


> Personally, I would have copied the data using rsync to the
> temporary XFS filesystem of the same size and shape of the final
> destination (via mkfs parameters to ensure stripe unit/width match
> final destination) and then used xfs_copy to do a block level copy
> of the temporary filesystem back to the final destination. xfs_copy
> will run *much* faster than xfsdump/restore....
>

Next time I will do it like you suggest, with one minor change:
instead of xfs_copy I would use dd, which makes sense if the
filesystem is almost full. Or do you believe that xfs_copy is faster
than dd? Or will xfs_growfs create any problems?

I used dd on Saturday to copy the 15TB xfs filesystem back into the
20TB RAID10 volume and enlarged the filesystem with xfs_growfs. The
result was an xfs filesystem with layout parameters matching the
temporary RAID5 volume built from 16 1TB disks with a 256K chunksize,
whereas the new RAID10 volume consists of 20 2TB disks using a
chunksize of 512K. And growing the filesystem raised the allocation
group count from 32 to 45.

I therefore reformatted the 20TB volume with a fresh xfs filesystem
and let mkfs.xfs decide about the layout.
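
The resulting geometry can be checked with xfs_info on the mount
point (the path below is an assumption):

  xfs_info /backup
  # the interesting fields are sunit/swidth (stripe geometry, in
  # filesystem blocks) and agcount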

Does that give me an optimal layout? I will enlarge the filesystem
in the future, which will increase my allocation group count. Is
that a problem I should have avoided in advance by reducing the
agcount?

Kind regards and thanks very much for the useful info.

Peter Koch

-- 
Peter Koch
Passauer Strasse 32, 47249 Duisburg
Tel.: 0172 2470263

