* xfsrestore performance
@ 2016-05-27 22:39 xfs.pkoch
  2016-05-28  9:25 ` xfs.pkoch
  2016-05-29 23:20 ` Dave Chinner
  0 siblings, 2 replies; 6+ messages in thread
From: xfs.pkoch @ 2016-05-27 22:39 UTC (permalink / raw)
  To: xfs


Dear XFS experts,

I was using a 16TB linux mdraid raid10 volume built from 16 Seagate
2TB disks, which was formatted with an ext3 filesystem. It contained
a couple of hundred very large files (ZFS full and incremental dumps
with sizes between 10GB and 400GB). It also contained 7 million
files from our users' home directories, which were backed up with
rsync --link-dest=<last backup dir>, so most of these files are
just hard links to previous versions.
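
The backups are created roughly like this - a minimal sketch, with
made-up directory names:

  # unchanged files become hard links into $PREV instead of new copies
  PREV=/backup/home/2016-05-26        # previous backup dir (name made up)
  DEST=/backup/home/2016-05-27        # new backup dir (name made up)
  rsync -aH --link-dest="$PREV" /home/ "$DEST/"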

Two weeks ago I grew the volume from 16TB to 20TB, which
linux mdraid can do while the filesystem stays in use.
The reshape lasted two days. Then I unmounted the ext3 filesystem
to grow it from 16TB to 20TB. And guess what - ext3 does not
support filesystems > 16TB.
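
The grow/resize sequence was roughly the following - the device name
and mount point are only illustrative:

  # (after adding the new disks to the array with mdadm --add)
  mdadm --grow /dev/mdX --raid-devices=20   # reshape; ran for two days
  mdadm --wait /dev/mdX                     # wait until the reshape completes
  umount /backup                            # mount point is made up
  resize2fs /dev/mdX                        # this is where ext3's 16TB limit bites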

I decided to change the filesystem to XFS, but the system must be
online during weekdays, so I only have a 48-hour window for
lengthy copies.

I added a 16TB temporary RAID5 volume to the machine and here's
what I did so far:

1: create a 14TB XFS filesystem on the temporary RAID5 volume
2: first rsync run to copy the ext3 fs to the temporary XFS fs,
this took 6 days
3: another rsync run to copy what changed during the first run,
this took another 2 days
4: another rsync run to copy what changed during the second run,
this took another day
5: xfsdump the temporary xfs fs to /dev/null (see the sketch below),
this took 20 hours
6: remount the ext3 fs read-only and do a final rsync run to
copy what changed during the third run. This took 10 hours.
7: delete the ext3 fs and create a 20TB xfs fs
8: copy back the temporary xfs fs to the new xfs fs using
xfsdump | xfsrestore (see the sketch below)
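
For reference, the commands behind steps 5 and 8 look roughly like this
(step 8 is the exact command shown in the log below; the step-5 line is
only my reconstruction of it):

  # step 5: time a full dump of the temporary fs by sending it to /dev/null
  xfsdump -J -p600 - /xtmp > /dev/null

  # step 8: pipe a dump of the temporary fs into a restore onto the new fs
  xfsdump -J -p600 - /xtmp | xfsrestore -J -a /var/tmp - /xtmp2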

Here's my problem: since dumping the temporary xfs fs to /dev/null
took less than a day, I expected the xfsdump | xfsrestore
combination to finish in less than 2 days. xfsdump | xfsrestore
should be a lot faster than rsync since it just pumps blocks from
one xfs fs into another one.

But either xfsrestore is painfully slow or I did something wrong:

Please have a look:

root@backup:/var/tmp# xfsdump -J -p600 - /xtmp | xfsrestore -J -a /var/tmp - /xtmp2
xfsrestore: using file dump (drive_simple) strategy
xfsdump: using file dump (drive_simple) strategy
xfsrestore: version 3.1.3 (dump format 3.0)
xfsdump: version 3.1.3 (dump format 3.0)
xfsrestore: searching media for dump
xfsdump: level 0 dump of backup:/xtmp
xfsdump: dump date: Fri May 27 13:15:42 2016
xfsdump: session id: adb95c2e-332b-4dde-9c8b-e03760d5a83b
xfsdump: session label: ""
xfsdump: ino map phase 1: constructing initial dump list
xfsdump: status at 13:25:42: inomap phase 1 14008321/28643415 inos scanned,
600 seconds elapsed
xfsdump: ino map phase 2: skipping (no pruning necessary)
xfsdump: ino map phase 3: skipping (only one dump stream)
xfsdump: ino map construction complete
xfsdump: estimated dump size: 12831156312640 bytes
xfsdump: creating dump session media file 0 (media 0, file 0)
xfsdump: dumping ino map
xfsrestore: examining media file 0
xfsrestore: dump description:
xfsrestore: hostname: backup
xfsrestore: mount point: /xtmp
xfsrestore: volume: /dev/md6
xfsrestore: session time: Fri May 27 13:15:42 2016
xfsrestore: level: 0
xfsrestore: session label: ""
xfsrestore: media label: ""
xfsrestore: file system id: 29825afd-5d7e-485f-9eb1-8871a21ce71d
xfsrestore: session id: adb95c2e-332b-4dde-9c8b-e03760d5a83b
xfsrestore: media id: 5ef22542-774a-4504-a823-d007d2ce4720
xfsrestore: searching media for directory dump
xfsrestore: NOTE: attempt to reserve 1162387864 bytes for
/var/tmp/xfsrestorehousekeepingdir/dirattr using XFS_IOC_RESVSP64 failed:
Operation not supported (95)
xfsrestore: NOTE: attempt to reserve 286438226 bytes for
/var/tmp/xfsrestorehousekeepingdir/namreg using XFS_IOC_RESVSP64 failed:
Operation not supported (95)
xfsrestore: reading directories
xfsdump: dumping directories
xfsdump: dumping non-directory files
xfsdump: status at 20:04:52: 1/7886560 files dumped, 0.0% data dumped,
24550 seconds elapsed
xfsrestore: 20756853 directories and 274128228 entries processed
xfsrestore: directory post-processing
xfsrestore: restoring non-directory files
xfsdump: status at 21:27:27: 26/7886560 files dumped, 0.0% data dumped,
29505 seconds elapsed
xfsdump: status at 21:35:46: 20930/7886560 files dumped, 0.0% data dumped,
30004 seconds elapsed
xfsdump: status at 21:46:26: 46979/7886560 files dumped, 0.1% data dumped,
30644 seconds elapsed
xfsdump: status at 21:55:52: 51521/7886560 files dumped, 0.1% data dumped,
31210 seconds elapsed
xfsdump: status at 22:05:45: 57770/7886560 files dumped, 0.1% data dumped,
31803 seconds elapsed
xfsdump: status at 22:15:43: 63142/7886560 files dumped, 0.1% data dumped,
32401 seconds elapsed
xfsdump: status at 22:25:42: 73621/7886560 files dumped, 0.1% data dumped,
33000 seconds elapsed
xfsdump: status at 22:35:51: 91223/7886560 files dumped, 0.1% data dumped,
33609 seconds elapsed
xfsdump: status at 22:45:42: 94096/7886560 files dumped, 0.2% data dumped,
34200 seconds elapsed
xfsdump: status at 22:55:42: 96702/7886560 files dumped, 0.2% data dumped,
34800 seconds elapsed
xfsdump: status at 23:05:42: 102808/7886560 files dumped, 0.2% data dumped,
35400 seconds elapsed
xfsdump: status at 23:16:15: 107096/7886560 files dumped, 0.2% data dumped,
36033 seconds elapsed
xfsdump: status at 23:25:47: 109079/7886560 files dumped, 0.2% data dumped,
36605 seconds elapsed
xfsdump: status at 23:35:52: 112318/7886560 files dumped, 0.2% data dumped,
37210 seconds elapsed
xfsdump: status at 23:45:46: 114975/7886560 files dumped, 0.2% data dumped,
37804 seconds elapsed
xfsdump: status at 23:55:55: 117260/7886560 files dumped, 0.2% data dumped,
38413 seconds elapsed
xfsdump: status at 00:05:44: 118722/7886560 files dumped, 0.2% data dumped,
39002 seconds elapsed

Seems like 2 days was a little optimistic

Any ideas what's going wrong here?

Peter Koch

* Re: xfsrestore performance
@ 2016-05-31  9:39 xfs.pkoch
  0 siblings, 0 replies; 6+ messages in thread
From: xfs.pkoch @ 2016-05-31  9:39 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 4304 bytes --]

Dear Dave:

Thanks very much for your explanations

2016-05-30 1:20 GMT+02:00 Dave Chinner <david@fromorbit.com>:

> ....
> Oh, dear. There's a massive red flag. I'll come back to it...
>



> > 5: xfsdump the temporary xfs fs to /dev/null. took 20 hours
>
> Nothing to slow down xfsdump reading from disk. Benchmarks lie.
>

> dump is fast - restore is the slow point because it has to recreate
> everything. That's what limits the speed of dump - the pipe has a
> bound limit on data in flight, so dump is throttled to restore
> speed when you run this.
>
> And, as I said I'll come back to, restore is slow because:
>

> The filesystem is not exactly as you described. Did you notice that
> xfs_restore realises that it has to restore 20 million directories
> and *274 million* directory entries? i.e. for those 7 million inodes
> containing data, there are roughly 40 hard links pointing to each
> inode. There are also 3 directory inodes for every regular file.
> This is not a "data mostly" filesystem - it has vastly more metadata
> than it has data, even though the data takes up more space.
>

Our backup server has 46 versions of our home directories and 158
versions of our mail server, so if a file has not been changed for more
than a year it exists once on the backup server together with
45 or 157 hard links, respectively.

I'm astonished myself - first by the numbers, and also by the fact
that our backup strategy works quite well.

rsync also does a very good job. It was able to copy all these hard links
in 6 days from a 16TB ext3 filesystem on a RAID10 volume to a
15TB xfs filesystem on a RAID5 volume.

And right now 4 rsync processes are copying the 15TB xfs filesystem
back to a 20TB xfs filesystem, and it looks as if this will finish
today (after only 3 days). Very nice.
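
Nothing fancy - roughly one rsync per top-level subtree, with made-up
subtree names and the destination assumed to be mounted at /xtmp2:

  # four rsyncs, one per top-level subtree, running in parallel
  for dir in home mail dumps misc; do    # subtree names are made up
      rsync -aH /xtmp/"$dir"/ /xtmp2/"$dir"/ &
  done
  wait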

> Keep in mind that it took dump the best part of 7 hours just to read
> all the inodes and the directory structure to build the dump
> inventory. This matches with the final ext3 rsync pass of 10 hours
> which should have copied very little data.  Creating 270 million
> hard links in 20 million directories from scratch takes a long time,
> and xfs_restore will be no faster at that than rsync....
>

That was my misunderstanding. I believed (and hoped) that a tool
built for a specific filesystem would outperform a generic
tool like rsync. I thought xfsdump would write all used filesystem
blocks into a data stream and xfsrestore would just read the
blocks from stdin and write them back to the destination filesystem -
much like a dd process that knows about the device content and
can skip unused blocks.


> > Seems like 2 days was a little optimistic
>
> Just a little. :/
>

It would have taken approx 1000 hours


> Personally, I would have copied the data using rsync to the
> temporary XFS filesystem of the same size and shape of the final
> destination (via mkfs parameters to ensure stripe unit/width match
> final destination) and then used xfs_copy to do a block level copy
> of the temporary filesystem back to the final destination. xfs_copy
> will run *much* faster than xfsdump/restore....
>
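
If I understand the suggestion correctly, that would be roughly the
following (the final volume's device name and the stripe values are
only examples):

  # build the temporary fs with the final volume's geometry
  mkfs.xfs -d su=512k,sw=10 /dev/md6        # su/sw values are illustrative
  # ... rsync the data onto it as before ...
  # then block-copy the whole filesystem back to the final volume
  xfs_copy /dev/md6 /dev/mdX                # /dev/mdX stands for the final volume
  # and grow into the remaining space
  mount /dev/mdX /backup && xfs_growfs /backup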

Next time I will do it as you suggest, with one minor change: instead
of xfs_copy I would use dd, which makes sense if the filesystem is
almost full. Or do you believe that xfs_copy is faster than dd?
Or will xfs_growfs create any problems?

I used dd on Saturday to copy the 15TB xfs filesystem back
onto the 20TB raid10 volume and enlarged the filesystem with
xfs_growfs. The result was an xfs filesystem with layout parameters
matching the temporary raid5 volume built from 16 1TB disks
with a 256K chunk size. But the new raid10 volume consists of
20 2TB disks using a chunk size of 512K. And growing the filesystem
raised the allocation group count from 32 to 45.

I reformatted the 20TB volume with a fresh xfs filesystem and
let mkfs.xfs decide the layout.
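
To sanity-check the result I would look at the geometry mkfs picked up,
roughly like this (the mount point is assumed; the su/sw values in the
comment are only what I'd expect for 20 disks in RAID10 with 512K chunks):

  # show sunit/swidth and agcount chosen by mkfs on the mounted new fs
  xfs_info /xtmp2
  # if the detected geometry were wrong it could be forced at mkfs time, e.g.:
  #   mkfs.xfs -f -d su=512k,sw=10 /dev/mdX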

Does that give me an optimal layout? I will enlarge the filesystem
in the future, which will increase my allocation group count. Is that
a problem I should have avoided in advance by reducing
the agcount?

Kind regards and thanks very much for the useful info.

Peter Koch

-- 
Peter Koch
Passauer Strasse 32, 47249 Duisburg
Tel.: 0172 2470263


Thread overview: 6+ messages
2016-05-27 22:39 xfsrestore performance xfs.pkoch
2016-05-28  9:25 ` xfs.pkoch
2016-05-28 13:23   ` Brian Foster
2016-05-30 12:52   ` Emmanuel Florac
2016-05-29 23:20 ` Dave Chinner
2016-05-31  9:39 xfs.pkoch
