* bug: xfs_repair becomes very slow when file system has a large sparse file
From: Joe Landman @ 2011-08-19 16:37 UTC
To: xfs
(If you prefer we file this on a bug reporting system, please let me
know where and I'll do this).
Scenario: xfs_repair being run against a roughly 17 TB volume containing
one large sparse file: logical size 7 PB, actual size a few hundred GB.
Metadata: kernels 2.6.32.41, 2.6.39.4, and others; xfs tools 3.1.5.
Hardware RAID ~17 TB LUN. Base OS: CentOS 5.6 + updates + updated xfs
tools + our kernels. Using an external journal on a different device.
What we observe:
Running xfs_repair
xfs_repair -l /dev/md2 -vv /dev/sdd2
the system gets to phase 3 and the first AG, then it appears to stop.
After an hour or so we strace it, and we see
pread(...) = 4096
occurring about 2-3 times per second. An hour later it's down to 1 per
second. An hour after that, it's once every 2 seconds.
Also, somewhere on this disk, someone has created an unfortunately large
file
[root@jr4-2 ~]# ls -alF /data/brick-sdd2/dht/scratch/xyzpdq
total 4652823496
d--------- 2 1232 1000 86 Jun 27 20:31 ./
drwx------ 104 1232 1000 65536 Aug 17 23:53 ../
-rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
-rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
-rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
-rw------- 1 1232 1000 7800416534233088 Jun 27 20:18 Gau-00000.rwf
[root@jr4-2 ~]# ls -ahlF /data/brick-sdd2/dht/scratch/xyzpdq
total 4.4T
d--------- 2 1232 1000 86 Jun 27 20:31 ./
drwx------ 104 1232 1000 64K Aug 17 23:53 ../
-rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
-rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
-rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
-rw------- 1 1232 1000 7.0P Jun 27 20:18 Gau-00000.rwf
This isn't a 7 PB file system; it's a 100 TB file system across 3 machines,
roughly 17 TB per brick or OSS. Gau-00000.rwf is obviously a sparse
file, as can be seen with ls -alsF.
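To illustrate how a sparse file shows up in these listings (a generic sketch with a made-up temp file, not the actual file on the affected system): ls -l reports the logical size, while the first column of ls -ls and the output of du report allocated blocks:

```shell
# Sketch: logical size vs allocated size of a sparse file.
# The file and its 100 MB size are illustrative only.
f=$(mktemp)
truncate -s 100M "$f"            # 100 MB logical size, nothing allocated
ls -l  "$f" | awk '{print $5}'   # logical size in bytes
ls -ls "$f" | awk '{print $1}'   # allocated blocks (stays near 0)
du -k "$f"                       # du also reports allocated space only
rm -f "$f"
```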
Upon removing that file, xfs_repair completes within ~10 minutes.
Leaving the file in place, xfs_repair does not terminate; it just
gets asymptotically slower.
I suspect it is looking for extents which are not there as part of the
repair.
Please let me know if you need more information, or if you would like me
to file this somewhere else for official reportage.
Regards,
Joe
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: bug: xfs_repair becomes very slow when file system has a large sparse file
From: Dave Chinner @ 2011-08-20 0:26 UTC
To: Joe Landman; +Cc: xfs
On Fri, Aug 19, 2011 at 12:37:05PM -0400, Joe Landman wrote:
> (If you prefer we file this on a bug reporting system, please let me
> know where and I'll do this).
>
> Scenario: xfs_repair being run against an about 17TB volume,
> containing 1 large sparse file. Logical size of 7 PB, actual size,
> a few hundred GB.
>
> Metadata: Kernel = 2.6.32.41, 2.6.39.4, and others. Xfstools 3.1.5.
> Hardware RAID ~17TB LUN. Base OS: Centos 5.6 + updates + updated
> xfs tools + our kernels. Using external journal on a different
> device
>
> What we observe:
>
> Running xfs_repair
>
> xfs_repair -l /dev/md2 -vv /dev/sdd2
can you post the actual output of xfs_repair?
>
> the system gets to stage 3 and the first ag. Then it appears to
> stop. After an hour or so, we strace it, and we see
>
> pread(...) = 4096
and the same for the strace, along with syscall completion time?
(i.e. strace -ttt -T .....) That will tell us if the time is spent
doing IO or in the repair binary.
What is the CPU usage when this happens? How much memory do you
have? Is the machine swapping while it is slowing down? A couple of
minutes output of 'vmstat 5' when it is in this state would be handy.
> occurring about 2-3 per second. An hour later, its down to 1 per
> second. An hour after that, its once every 2 seconds.
>
> Also, somewhere on this disk, someone has created an unfortunately
> large file
>
> [root@jr4-2 ~]# ls -alF /data/brick-sdd2/dht/scratch/xyzpdq
> total 4652823496
> d--------- 2 1232 1000 86 Jun 27 20:31 ./
> drwx------ 104 1232 1000 65536 Aug 17 23:53 ../
> -rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
> -rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
> -rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
> -rw------- 1 1232 1000 7800416534233088 Jun 27 20:18 Gau-00000.rwf
>
> [root@jr4-2 ~]# ls -ahlF /data/brick-sdd2/dht/scratch/xyzpdq
> total 4.4T
> d--------- 2 1232 1000 86 Jun 27 20:31 ./
> drwx------ 104 1232 1000 64K Aug 17 23:53 ../
> -rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
> -rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
> -rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
> -rw------- 1 1232 1000 7.0P Jun 27 20:18 Gau-00000.rwf
>
> This isn't a 7PB file system, its a 100TB file system across 3
> machines, roughly 17TB per brick or OSS. The Gau-00000.rwf is
> obviously a sparse file, as could be seen with an ls -alsF
What does du tell you about it? xfs_io -f -c "stat" <large file>?
xfs_bmap -vp <large file>?
> Upon removing that file, the xfs_repair completes within ~10
> minutes. Leaving that file on there, the xfs_repair does not
> terminate, it just gets asymptotically slower.
That could simply be the memory footprint causing more swapping per
operation to occur. Or it could be that something is simply getting
too large for the index type being used. If the machine is not
swapping, can you point 'perf top -p <pid of xfs_repair>' at it so
we might see where that CPU time is being spent? (you might need to
use a non-stripped version of the binary to get any useful
information)
> Please let me know if you need more information, or if you would
> like me to file this somewhere else for official reportage.
This is the right place to let us know about problems.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: bug: xfs_repair becomes very slow when file system has a large sparse file
From: Joe Landman @ 2011-08-20 0:38 UTC
To: Dave Chinner; +Cc: xfs
On 8/19/2011 8:26 PM, Dave Chinner wrote:
> On Fri, Aug 19, 2011 at 12:37:05PM -0400, Joe Landman wrote:
>> (If you prefer we file this on a bug reporting system, please let me
>> know where and I'll do this).
>>
>> Scenario: xfs_repair being run against an about 17TB volume,
>> containing 1 large sparse file. Logical size of 7 PB, actual size,
>> a few hundred GB.
>>
>> Metadata: Kernel = 2.6.32.41, 2.6.39.4, and others. Xfstools 3.1.5.
>> Hardware RAID ~17TB LUN. Base OS: Centos 5.6 + updates + updated
>> xfs tools + our kernels. Using external journal on a different
>> device
>>
>> What we observe:
>>
>> Running xfs_repair
>>
>> xfs_repair -l /dev/md2 -vv /dev/sdd2
>
> can you post the actual output of xfs_repair?
[root@jr4-2 ~]# xfs_repair -l /dev/md2 -vv /dev/sdd2
Phase 1 - find and verify superblock...
- max_mem = 37094400, icount = 1346752, imem = 5260, dblock = 4391112384, dmem = 2144097
- block cache size set to 4361880 entries
Phase 2 - using external log on /dev/md2
- zero log...
zero_log: head block 126232 tail block 126232
- scan filesystem freespace and inode maps...
agf_freeblks 11726908, counted 11726792 in ag 1
sb_ifree 2366, counted 2364
sb_fdblocks 2111548832, counted 2111548716
- found root inode chunk
libxfs_bcache: 0x8804c0
Max supported entries = 4361880
Max utilized entries = 4474
Active entries = 4474
Hash table size = 545235
Hits = 0
Misses = 4474
Hit ratio = 0.00
MRU 0 entries = 4474 (100%)
MRU 1 entries = 0 ( 0%)
MRU 2 entries = 0 ( 0%)
MRU 3 entries = 0 ( 0%)
MRU 4 entries = 0 ( 0%)
MRU 5 entries = 0 ( 0%)
MRU 6 entries = 0 ( 0%)
MRU 7 entries = 0 ( 0%)
MRU 8 entries = 0 ( 0%)
MRU 9 entries = 0 ( 0%)
MRU 10 entries = 0 ( 0%)
MRU 11 entries = 0 ( 0%)
MRU 12 entries = 0 ( 0%)
MRU 13 entries = 0 ( 0%)
MRU 14 entries = 0 ( 0%)
MRU 15 entries = 0 ( 0%)
Hash buckets with 0 entries 541170 ( 0%)
Hash buckets with 1 entries 3765 ( 84%)
Hash buckets with 2 entries 242 ( 10%)
Hash buckets with 3 entries 15 ( 1%)
Hash buckets with 4 entries 36 ( 3%)
Hash buckets with 5 entries 6 ( 0%)
Hash buckets with 6 entries 1 ( 0%)
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
bad magic number 0xc88 on inode 5034047
bad version number 0x40 on inode 5034047
bad inode format in inode 5034047
correcting nblocks for inode 5034046, was 185195 - counted 0
bad magic number 0xc88 on inode 5034047, resetting magic number
bad version number 0x40 on inode 5034047, resetting version number
bad inode format in inode 5034047
cleared inode 5034047
(then it hangs here)
>> the system gets to stage 3 and the first ag. Then it appears to
>> stop. After an hour or so, we strace it, and we see
>>
>> pread(...) = 4096
>
> and the same for the strace, along with syscall completion time?
> (i.e. strace -ttt -T .....) That will tell us if the time is spend
> doing IO or in the repair binary.
Unfortunately, we fixed the original system. We'll see if we can create
this condition in the lab, and then post this.
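When we do, the sparse-file shape itself is trivial to recreate: a truncate-created file gives the same huge-logical/tiny-allocated layout. A sketch (the path and the size are made up for illustration, not taken from the customer system; the real test would use a much larger size on an XFS scratch volume):

```shell
# Recreate the reported shape: huge logical size, almost nothing allocated.
# Path and 1 GB size are illustrative placeholders.
f=$(mktemp /tmp/sparse-repro.XXXXXX)
truncate -s 1G "$f"        # 1 GB logical size, no blocks written
stat -c 'logical=%s bytes allocated=%b blocks' "$f"
du -h "$f"                 # confirms almost nothing is allocated
rm -f "$f"
```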
> What is the CPU usage when this happens? How much memory do you
Very low. The machine is effectively idle, user load of 0.01 or so.
There is 48 GB of RAM in these machines.
> have? Is the machine swapping while it is slowing down? A couple of
No. I've not seen xfs_repair use more than a few hundred MB on this problem.
> minutes output of 'vmstat 5' when it is in this state would be handy.
>
>> occurring about 2-3 per second. An hour later, its down to 1 per
>> second. An hour after that, its once every 2 seconds.
>>
>> Also, somewhere on this disk, someone has created an unfortunately
>> large file
>>
>> [root@jr4-2 ~]# ls -alF /data/brick-sdd2/dht/scratch/xyzpdq
>> total 4652823496
>> d--------- 2 1232 1000 86 Jun 27 20:31 ./
>> drwx------ 104 1232 1000 65536 Aug 17 23:53 ../
>> -rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
>> -rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
>> -rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
>> -rw------- 1 1232 1000 7800416534233088 Jun 27 20:18 Gau-00000.rwf
>>
>> [root@jr4-2 ~]# ls -ahlF /data/brick-sdd2/dht/scratch/xyzpdq
>> total 4.4T
>> d--------- 2 1232 1000 86 Jun 27 20:31 ./
>> drwx------ 104 1232 1000 64K Aug 17 23:53 ../
>> -rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
>> -rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
>> -rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
>> -rw------- 1 1232 1000 7.0P Jun 27 20:18 Gau-00000.rwf
>>
>> This isn't a 7PB file system, its a 100TB file system across 3
>> machines, roughly 17TB per brick or OSS. The Gau-00000.rwf is
>> obviously a sparse file, as could be seen with an ls -alsF
>
> What does du tell you about it? xfs_io -f -c "stat"<large file>?
> xfs_bmap -vp<large file>?
ls -alsF told me it was a few hundred GB; du gave a similar number.
>
>> Upon removing that file, the xfs_repair completes within ~10
>> minutes. Leaving that file on there, the xfs_repair does not
>> terminate, it just gets asymptotically slower.
>
> That could simply be the memory footprint causing more swapping per
> operation to occur. Or it could be that something is simply getting
> too large for the index type being used. If the machine is not
> swapping, can you point 'perf top -p<pid of xfs_repair>' at it so
> we might see where that CPU time is being spent? (you might need to
> use a non-stripped version of the binary to get any useful
> information)
Ok, we will try to recreate the condition in the lab and generate this.
>
>> Please let me know if you need more information, or if you would
>> like me to file this somewhere else for official reportage.
>
> This is the right place to let us know about problems.
Thanks!
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.,
email: landman@scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615
* Re: bug: xfs_repair becomes very slow when file system has a large sparse file
From: Dave Chinner @ 2011-08-20 2:00 UTC
To: Joe Landman; +Cc: xfs
On Fri, Aug 19, 2011 at 08:38:53PM -0400, Joe Landman wrote:
> On 8/19/2011 8:26 PM, Dave Chinner wrote:
> >On Fri, Aug 19, 2011 at 12:37:05PM -0400, Joe Landman wrote:
> >>(If you prefer we file this on a bug reporting system, please let me
> >>know where and I'll do this).
> >>
> >>Scenario: xfs_repair being run against an about 17TB volume,
> >>containing 1 large sparse file. Logical size of 7 PB, actual size,
> >>a few hundred GB.
> >>
> >>Metadata: Kernel = 2.6.32.41, 2.6.39.4, and others. Xfstools 3.1.5.
> >>Hardware RAID ~17TB LUN. Base OS: Centos 5.6 + updates + updated
> >>xfs tools + our kernels. Using external journal on a different
> >>device
> >>
> >>What we observe:
> >>
> >>Running xfs_repair
> >>
> >> xfs_repair -l /dev/md2 -vv /dev/sdd2
> >
> >can you post the actual output of xfs_repair?
>
> [root@jr4-2 ~]# xfs_repair -l /dev/md2 -vv /dev/sdd2
> Phase 1 - find and verify superblock...
> - max_mem = 37094400, icount = 1346752, imem = 5260, dblock =
> 4391112384, dmem = 2144097
> - block cache size set to 4361880 entries
> Phase 2 - using external log on /dev/md2
> - zero log...
> zero_log: head block 126232 tail block 126232
> - scan filesystem freespace and inode maps...
> agf_freeblks 11726908, counted 11726792 in ag 1
> sb_ifree 2366, counted 2364
> sb_fdblocks 2111548832, counted 2111548716
> - found root inode chunk
> libxfs_bcache: 0x8804c0
> Max supported entries = 4361880
> Max utilized entries = 4474
> Active entries = 4474
> Hash table size = 545235
> Hits = 0
> Misses = 4474
> Hit ratio = 0.00
> MRU 0 entries = 4474 (100%)
> MRU 1 entries = 0 ( 0%)
> MRU 2 entries = 0 ( 0%)
> MRU 3 entries = 0 ( 0%)
> MRU 4 entries = 0 ( 0%)
> MRU 5 entries = 0 ( 0%)
> MRU 6 entries = 0 ( 0%)
> MRU 7 entries = 0 ( 0%)
> MRU 8 entries = 0 ( 0%)
> MRU 9 entries = 0 ( 0%)
> MRU 10 entries = 0 ( 0%)
> MRU 11 entries = 0 ( 0%)
> MRU 12 entries = 0 ( 0%)
> MRU 13 entries = 0 ( 0%)
> MRU 14 entries = 0 ( 0%)
> MRU 15 entries = 0 ( 0%)
> Hash buckets with 0 entries 541170 ( 0%)
> Hash buckets with 1 entries 3765 ( 84%)
> Hash buckets with 2 entries 242 ( 10%)
> Hash buckets with 3 entries 15 ( 1%)
> Hash buckets with 4 entries 36 ( 3%)
> Hash buckets with 5 entries 6 ( 0%)
> Hash buckets with 6 entries 1 ( 0%)
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> bad magic number 0xc88 on inode 5034047
> bad version number 0x40 on inode 5034047
> bad inode format in inode 5034047
> correcting nblocks for inode 5034046, was 185195 - counted 0
> bad magic number 0xc88 on inode 5034047, resetting magic number
> bad version number 0x40 on inode 5034047, resetting version number
> bad inode format in inode 5034047
> cleared inode 5034047
That doesn't look good - something has trashed an inode cluster by
the look of it. Was this why you ran xfs_repair?
FWIW, do you know what the inode number of the large file was? I'm
wondering if it was in the same cluster as the above inode and so
was corrupted in some way that caused repair to head off into lala
land....
> >What is the CPU usage when this happens? How much memory do you
>
> Very low. The machine is effectively idle, user load of 0.01 or so.
OK, so repair wasn't burning up an entire CPU walking/searching
lists?
> >>This isn't a 7PB file system, its a 100TB file system across 3
> >>machines, roughly 17TB per brick or OSS. The Gau-00000.rwf is
> >>obviously a sparse file, as could be seen with an ls -alsF
> >
> >What does du tell you about it? xfs_io -f -c "stat"<large file>?
> >xfs_bmap -vp<large file>?
>
> ls -alsF told me it was a few hundred GB. Du gave a similar number.
Ok - the other commands, however, tell me more than just the disk
blocks used - they also tell me how many extents the file has and
how they were laid out, which is what I really need to know about
that sparse file. It will also help me recreate a file with a
similar layout to see if xfs_repair chokes on it here, or whether it
was something specific to a corruption encountered....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: bug: xfs_repair becomes very slow when file system has a large sparse file
From: Christoph Hellwig @ 2011-09-07 10:43 UTC
To: Dave Chinner; +Cc: xfs, Joe Landman
Joe,
do you have any of the updates Dave asked for?
On Sat, Aug 20, 2011 at 12:00:04PM +1000, Dave Chinner wrote:
> On Fri, Aug 19, 2011 at 08:38:53PM -0400, Joe Landman wrote:
> > On 8/19/2011 8:26 PM, Dave Chinner wrote:
> > >On Fri, Aug 19, 2011 at 12:37:05PM -0400, Joe Landman wrote:
> > >>(If you prefer we file this on a bug reporting system, please let me
> > >>know where and I'll do this).
> > >>
> > >>Scenario: xfs_repair being run against an about 17TB volume,
> > >>containing 1 large sparse file. Logical size of 7 PB, actual size,
> > >>a few hundred GB.
> > >>
> > >>Metadata: Kernel = 2.6.32.41, 2.6.39.4, and others. Xfstools 3.1.5.
> > >>Hardware RAID ~17TB LUN. Base OS: Centos 5.6 + updates + updated
> > >>xfs tools + our kernels. Using external journal on a different
> > >>device
> > >>
> > >>What we observe:
> > >>
> > >>Running xfs_repair
> > >>
> > >> xfs_repair -l /dev/md2 -vv /dev/sdd2
> > >
> > >can you post the actual output of xfs_repair?
> >
> > [root@jr4-2 ~]# xfs_repair -l /dev/md2 -vv /dev/sdd2
> > Phase 1 - find and verify superblock...
> > - max_mem = 37094400, icount = 1346752, imem = 5260, dblock =
> > 4391112384, dmem = 2144097
> > - block cache size set to 4361880 entries
> > Phase 2 - using external log on /dev/md2
> > - zero log...
> > zero_log: head block 126232 tail block 126232
> > - scan filesystem freespace and inode maps...
> > agf_freeblks 11726908, counted 11726792 in ag 1
> > sb_ifree 2366, counted 2364
> > sb_fdblocks 2111548832, counted 2111548716
> > - found root inode chunk
> > libxfs_bcache: 0x8804c0
> > Max supported entries = 4361880
> > Max utilized entries = 4474
> > Active entries = 4474
> > Hash table size = 545235
> > Hits = 0
> > Misses = 4474
> > Hit ratio = 0.00
> > MRU 0 entries = 4474 (100%)
> > MRU 1 entries = 0 ( 0%)
> > MRU 2 entries = 0 ( 0%)
> > MRU 3 entries = 0 ( 0%)
> > MRU 4 entries = 0 ( 0%)
> > MRU 5 entries = 0 ( 0%)
> > MRU 6 entries = 0 ( 0%)
> > MRU 7 entries = 0 ( 0%)
> > MRU 8 entries = 0 ( 0%)
> > MRU 9 entries = 0 ( 0%)
> > MRU 10 entries = 0 ( 0%)
> > MRU 11 entries = 0 ( 0%)
> > MRU 12 entries = 0 ( 0%)
> > MRU 13 entries = 0 ( 0%)
> > MRU 14 entries = 0 ( 0%)
> > MRU 15 entries = 0 ( 0%)
> > Hash buckets with 0 entries 541170 ( 0%)
> > Hash buckets with 1 entries 3765 ( 84%)
> > Hash buckets with 2 entries 242 ( 10%)
> > Hash buckets with 3 entries 15 ( 1%)
> > Hash buckets with 4 entries 36 ( 3%)
> > Hash buckets with 5 entries 6 ( 0%)
> > Hash buckets with 6 entries 1 ( 0%)
> > Phase 3 - for each AG...
> > - scan and clear agi unlinked lists...
> > - process known inodes and perform inode discovery...
> > - agno = 0
> > bad magic number 0xc88 on inode 5034047
> > bad version number 0x40 on inode 5034047
> > bad inode format in inode 5034047
> > correcting nblocks for inode 5034046, was 185195 - counted 0
> > bad magic number 0xc88 on inode 5034047, resetting magic number
> > bad version number 0x40 on inode 5034047, resetting version number
> > bad inode format in inode 5034047
> > cleared inode 5034047
>
> That doesn't look good - something has trashed an inode cluster by
> the look of it. Was this why you ran xfs_repair?
>
> FWIW, do you know what the inode number of the large file was? I'm
> wondering if it was in the same cluster as the above inode and so
> was corrupted in some way that cause repair to head off into lala
> land....
>
> > >What is the CPU usage when this happens? How much memory do you
> >
> > Very low. The machine is effectively idle, user load of 0.01 or so.
>
> OK, so repair wasn't burning up an entire CPU walking/searching
> lists?
>
> > >>This isn't a 7PB file system, its a 100TB file system across 3
> > >>machines, roughly 17TB per brick or OSS. The Gau-00000.rwf is
> > >>obviously a sparse file, as could be seen with an ls -alsF
> > >
> > >What does du tell you about it? xfs_io -f -c "stat"<large file>?
> > >xfs_bmap -vp<large file>?
> >
> > ls -alsF told me it was a few hundred GB. Du gave a similar number.
>
> Ok - the other commands, however, tell me more than just the disk
> blocks used - they also tell me how many extents the file has and
> how they were laid out, which is what I really need to know about
> that sparse file. It will also help me recreate a file with a
> similar layout to see if xfs_repair chokes on it here, or whether it
> was something specific to a corruption encountered....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
---end quoted text---
* Re: bug: xfs_repair becomes very slow when file system has a large sparse file
From: Joe Landman @ 2012-01-10 17:32 UTC
To: Dave Chinner; +Cc: xfs
The user just gave us permission to work on the system, and we can be
destructive if needed. It took a few months, but we have a full
backup/snapshot which is now the primary; this is an "extra" system
to be re-missioned, but it still exhibits the problem reported last August.
FWIW:
[root@jr4-4 ~]# df -h /data
Filesystem Size Used Avail Use% Mounted on
/dev/md2 33T 30T 3.0T 92% /data
On 08/19/2011 08:26 PM, Dave Chinner wrote:
> On Fri, Aug 19, 2011 at 12:37:05PM -0400, Joe Landman wrote:
>> (If you prefer we file this on a bug reporting system, please let me
>> know where and I'll do this).
>>
>> Scenario: xfs_repair being run against an about 17TB volume,
>> containing 1 large sparse file. Logical size of 7 PB, actual size,
>> a few hundred GB.
>>
>> Metadata: Kernel = 2.6.32.41, 2.6.39.4, and others. Xfstools 3.1.5.
>> Hardware RAID ~17TB LUN. Base OS: Centos 5.6 + updates + updated
>> xfs tools + our kernels. Using external journal on a different
>> device
>>
>> What we observe:
>>
>> Running xfs_repair
>>
>> xfs_repair -l /dev/md2 -vv /dev/sdd2
>
> can you post the actual output of xfs_repair?
>
Will do below.
>>
>> the system gets to stage 3 and the first ag. Then it appears to
>> stop. After an hour or so, we strace it, and we see
>>
>> pread(...) = 4096
>
> and the same for the strace, along with syscall completion time?
> (i.e. strace -ttt -T .....) That will tell us if the time is spend
> doing IO or in the repair binary.
>
Ok. Will do below as well.
> What is the CPU usage when this happens? How much memory do you
A single CPU is pegged at ~100%.
> have? Is the machine swapping while it is slowing down? A couple of
No. Checked that.
> minutes output of 'vmstat 5' when it is in this state would be handy.
Ok, will append this as well.
>
>> occurring about 2-3 per second. An hour later, its down to 1 per
>> second. An hour after that, its once every 2 seconds.
>>
>> Also, somewhere on this disk, someone has created an unfortunately
>> large file
>>
>> [root@jr4-2 ~]# ls -alF /data/brick-sdd2/dht/scratch/xyzpdq
>> total 4652823496
>> d--------- 2 1232 1000 86 Jun 27 20:31 ./
>> drwx------ 104 1232 1000 65536 Aug 17 23:53 ../
>> -rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
>> -rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
>> -rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
>> -rw------- 1 1232 1000 7800416534233088 Jun 27 20:18 Gau-00000.rwf
>>
>> [root@jr4-2 ~]# ls -ahlF /data/brick-sdd2/dht/scratch/xyzpdq
>> total 4.4T
>> d--------- 2 1232 1000 86 Jun 27 20:31 ./
>> drwx------ 104 1232 1000 64K Aug 17 23:53 ../
>> -rw------- 1 1232 1000 21 Jun 27 09:57 Default.Route
>> -rw------- 1 1232 1000 250 Jun 27 09:57 Gau-00000.inp
>> -rw------- 1 1232 1000 0 Jun 27 09:57 Gau-00000.d2e
>> -rw------- 1 1232 1000 7.0P Jun 27 20:18 Gau-00000.rwf
>>
>> This isn't a 7PB file system, its a 100TB file system across 3
>> machines, roughly 17TB per brick or OSS. The Gau-00000.rwf is
>> obviously a sparse file, as could be seen with an ls -alsF
>
> What does du tell you about it? xfs_io -f -c "stat"<large file>?
> xfs_bmap -vp<large file>?
In the cases presented below, none of the files are sparse. I am
wondering if it simply happens for large files. I scanned the file
system for large files (greater than 10 GB in size) and found quite a
few, a number of which were 250+ GB.
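The scan was nothing fancy, just a find over the brick, along these lines (the mount point here is an example, not necessarily the exact command used):

```shell
# List files over 10 GB under the brick mount point, via ls -ls so both
# allocated blocks (first column) and logical size are visible and any
# sparse files stand out. /data is an example path; substitute the real one.
mnt=/data
find "$mnt" -xdev -type f -size +10G -exec ls -ls {} + 2>/dev/null || true
```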
>
>> Upon removing that file, the xfs_repair completes within ~10
>> minutes. Leaving that file on there, the xfs_repair does not
>> terminate, it just gets asymptotically slower.
I'd like to leave these in place until you tell me you need them to be
removed/looked at.
>
> That could simply be the memory footprint causing more swapping per
> operation to occur. Or it could be that something is simply getting
> too large for the index type being used. If the machine is not
> swapping, can you point 'perf top -p<pid of xfs_repair>' at it so
> we might see where that CPU time is being spent? (you might need to
> use a non-stripped version of the binary to get any useful
> information)
Ok. I'll generate that if needed.
>
>> Please let me know if you need more information, or if you would
>> like me to file this somewhere else for official reportage.
>
> This is the right place to let us know about problems.
>
Separating with 8x========== on top and ==========x8 on bottom, with
additional meta-data below the 8x ... portion
8x========== xfs_repair output for
# nohup xfs_repair -m 16384 -vvv -l /dev/md10 /dev/md2 > \
# repair.out 2>&1 &
[root@jr4-4 ~]# cat repair.out
Phase 1 - find and verify superblock...
- max_mem = 16777216, icount = 15540800, imem = 60706, dblock = 8782223872, dmem = 4288195
- block cache size set to 1547288 entries
Phase 2 - using external log on /dev/md10
- zero log...
zero_log: head block 2 tail block 2
- scan filesystem freespace and inode maps...
- found root inode chunk
libxfs_bcache: 0x8804c0
Max supported entries = 1547288
Max utilized entries = 11850
Active entries = 11850
Hash table size = 193411
Hits = 0
Misses = 11850
Hit ratio = 0.00
MRU 0 entries = 11850 (100%)
MRU 1 entries = 0 ( 0%)
MRU 2 entries = 0 ( 0%)
MRU 3 entries = 0 ( 0%)
MRU 4 entries = 0 ( 0%)
MRU 5 entries = 0 ( 0%)
MRU 6 entries = 0 ( 0%)
MRU 7 entries = 0 ( 0%)
MRU 8 entries = 0 ( 0%)
MRU 9 entries = 0 ( 0%)
MRU 10 entries = 0 ( 0%)
MRU 11 entries = 0 ( 0%)
MRU 12 entries = 0 ( 0%)
MRU 13 entries = 0 ( 0%)
MRU 14 entries = 0 ( 0%)
MRU 15 entries = 0 ( 0%)
Hash buckets with 0 entries 182822 ( 0%)
Hash buckets with 1 entries 9785 ( 82%)
Hash buckets with 2 entries 530 ( 8%)
Hash buckets with 3 entries 108 ( 2%)
Hash buckets with 4 entries 155 ( 5%)
Hash buckets with 5 entries 9 ( 0%)
Hash buckets with 8 entries 2 ( 0%)
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
# stuck here for more than 1 hour now
==========x8
8x========== top output
#
[root@jr4-4 ~]# top
top - 11:30:27 up 160 days, 13:42, 2 users, load average: 0.87, 0.86, 0.87
Tasks: 384 total, 2 running, 382 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.8%sy, 0.0%ni, 98.2%id, 0.8%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 49459860k total, 15226800k used, 34233060k free, 380k buffers
Swap: 13670752k total, 0k used, 13670752k free, 9612k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16623 root 20 0 15.6g 14g 908 R 99.9 30.0 121:00.98 xfs_repair
17099 root 20 0 12880 1232 732 R 2.0 0.0 0:00.02 top
1 root 20 0 10360 636 520 S 0.0 0.0 1:20.24 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 3:44.51 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 88:34.79 ksoftirqd/0
==========x8
8x========== vmstat 5 output for
# nohup vmstat 5 > vmstat.out 2>&1 &
# started the xfs_repair after starting this, so this is the initial
# data, subsequent data after the ... where xfs_repair appears to be
# mired down.
[root@jr4-4 ~]# tail -f vmstat.out
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 49093752 868 11020 0 0 744 249 0 0 0 1 98 1 0
0 0 0 49094260 868 11020 0 0 0 4 1616 264 0 0 100 0 0
0 0 0 49094384 876 11020 0 0 0 8 1614 262 0 0 100 0 0
0 0 0 49094320 876 11020 0 0 0 0 1614 271 0 0 100 0 0
0 0 0 49094332 884 11020 0 0 0 10 1619 275 0 0 100 0 0
1 0 0 45504692 884 11020 0 0 647582 24578 57587 68969 8 4 84 3 0
1 1 0 45481944 892 11020 0 0 2436 9 2232 1490 6 0 94 0 0
1 0 0 45478984 892 11020 0 0 306 7 1698 423 6 0 94 0 0
1 0 0 45477620 900 11020 0 0 134 10 1656 344 6 0 94 0 0
1 0 0 45476380 900 11020 0 0 106 0 1643 322 6 0 94 0 0
1 0 0 45475264 908 11020 0 0 90 6 1642 314 6 0 94 0 0
1 0 0 45474396 908 11020 0 0 76 4 1638 308 6 0 94 0 0
1 0 0 45473528 916 11020 0 0 74 6 1639 308 6 0 94 0 0
1 0 0 45472784 916 11020 0 0 66 0 1634 301 6 0 94 0 0
0 1 0 45471544 924 11020 0 0 174 10 1666 359 6 0 94 1 0
1 0 0 45470428 924 11020 0 0 179 4 1667 364 4 0 94 2 0
1 0 0 45469684 932 11020 0 0 114 6 1648 327 5 0 94 2 0
1 0 0 45469188 932 11020 0 0 94 0 1639 315 5 0 94 1 0
1 0 0 45468692 940 11020 0 0 85 6 1640 312 5 0 94 1 0
0 1 0 45467328 940 11020 0 0 246 0 1677 391 4 0 94 3 0
1 0 0 45466584 948 11020 0 0 135 6 1653 337 5 0 94 2 0
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 45465964 952 11016 0 0 101 2 1641 318 5 0 94 1 0
1 0 0 45465468 956 11020 0 0 88 7 1644 314 5 0 94 1 0
1 0 0 45464972 964 11020 0 0 80 6 1640 313 5 0 94 1 0
1 4 0 43957340 964 11020 0 0 251558 0 14506 21971 4 1 87 8 0
0 1 0 43732652 972 11020 0 0 75435 6 2454 1011 1 0 90 9 0
0 1 0 43728312 972 11020 0 0 355 0 1704 443 1 0 94 5 0
0 1 0 43724220 980 11020 0 0 310 6 1697 431 1 0 94 5 0
0 1 0 43721244 980 11020 0 0 315 0 1694 423 0 0 94 6 0
0 1 0 43719508 988 11020 0 0 325 11 1703 438 1 0 94 5 0
0 5 0 43475228 988 11020 0 0 7897 4 5164 2114 2 0 92 5 0
0 1 0 43239876 996 11020 0 0 81598 6 2387 967 1 0 90 9 0
0 1 0 43234048 996 11020 0 0 378 0 1710 455 1 0 94 6 0
0 1 0 42899620 1004 11020 0 0 63013 6 3309 3314 2 0 93 5 0
0 5 0 42643932 1004 11020 0 0 48565 0 2664 2174 1 0 92 7 0
1 0 0 42390972 1012 11020 0 0 47660 6 2905 1949 1 0 92 6 0
1 0 0 42389236 1012 11020 0 0 237 4 1679 388 4 0 94 2 0
0 1 0 42373472 324 9724 0 0 2662 34 1786 584 1 0 94 5 0
0 1 0 42367004 348 9740 0 0 333 10 1704 445 0 0 94 6 0
0 1 0 42362664 348 9740 0 0 326 0 1698 435 0 0 94 6 0
0 1 0 42358944 356 9740 0 0 326 6 1702 434 0 0 94 6 0
0 1 0 42354728 356 9740 0 0 758 0 1705 437 0 0 94 6 0
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 4 0 42313932 364 9740 0 0 6844 6 2211 765 0 0 93 6 0
0 5 0 41600312 364 9740 0 0 126161 6 8847 7009 2 1 89 8 0
1 0 0 40934052 372 9744 0 0 141561 20 16966 12097 3 1 88 8 0
0 1 0 40546056 372 9744 0 0 75141 0 4493 3544 2 0 91 7 0
0 5 0 40097548 380 9744 0 0 83139 6 11961 5714 2 1 89 8 0
1 4 0 39235500 380 9744 0 0 162830 0 21207 15579 3 1 88 9 0
1 4 0 36952660 388 9744 0 0 470646 6 73141 48593 6 3 77 15 0
0 5 0 36735288 400 9744 0 0 41452 0 6851 2697 1 1 86 13 0
3 3 0 35785820 408 9744 0 0 134878 10 33292 18313 3 1 85 10 0
1 4 0 35285976 408 9744 0 0 171517 0 9015 5008 1 1 85 13 0
0 1 0 35154040 416 9744 0 0 336041 10 4593 2815 2 1 86 12 0
0 1 0 35145980 416 9744 0 0 309 0 1770 426 0 0 94 6 0
0 1 0 35141268 424 9744 0 0 279 6 1758 412 3 0 94 3 0
0 1 0 35140028 424 9744 0 0 198 0 1715 366 3 0 94 3 0
1 0 0 35138540 432 9744 0 0 240 6 1740 391 3 0 94 3 0
1 0 0 35135688 432 9744 0 0 232 4 1733 388 3 0 94 3 0
0 1 0 35130480 440 9744 0 0 352 6 1795 446 1 0 94 5 0
0 1 0 35127628 444 9744 0 0 375 0 1802 455 0 0 94 6 0
0 1 0 35118576 452 9744 0 0 346 10 1792 446 2 0 94 4 0
0 1 0 35114484 452 9744 0 0 514 0 1872 528 0 0 94 6 0
0 1 0 35111260 460 9744 0 0 329 6 1782 434 0
0 94 6 0
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy
id wa st
0 1 0 35108408 464 9740 0 0 338 2 1785 438 1
0 94 6 0
0 1 0 35105432 468 9744 0 0 361 7 1800 451 1
0 94 6 0
0 1 0 35102828 476 9744 0 0 326 6 1783 439 0
0 94 6 0
0 1 0 35099852 476 9744 0 0 365 0 1797 450 0
0 94 6 0
0 1 0 35097000 484 9744 0 0 362 6 1800 453 0
0 94 6 0
0 1 0 35094272 484 9744 0 0 371 0 1801 456 1
0 94 6 0
0 1 0 35091172 492 9744 0 0 355 6 1798 452 0
0 94 6 0
0 1 0 35088320 492 9744 0 0 355 0 1792 444 1
0 94 6 0
0 1 0 35085716 500 9744 0 0 346 10 1794 447 1
0 94 6 0
0 1 0 35083112 500 9744 0 0 344 0 1786 441 1
0 94 6 0
0 1 0 35080136 508 9744 0 0 341 6 1789 445 0
0 94 6 0
1 0 0 35077408 508 9744 0 0 345 0 1787 440 0
0 94 6 0
0 1 0 35074556 516 9744 0 0 372 10 1810 462 0
0 94 6 0
0 1 0 35071704 516 9744 0 0 338 0 1785 437 0
0 94 6 0
0 1 0 35069224 524 9744 0 0 306 6 1771 427 1
0 94 6 0
1 0 0 35066992 524 9744 0 0 305 4 1769 423 1
0 94 6 0
0 5 0 35026196 532 9744 0 0 127340 6 3111 3127 2
0 89 9 0
0 1 0 35013052 540 9744 0 0 113392 6 2554 2902 2
0 90 8 0
0 1 0 35010200 540 9744 0 0 16826 0 1854 891 1
0 94 6 0
0 1 0 35005860 548 9744 0 0 28107 6 2006 1215 0
0 93 7 0
0 5 0 34971388 548 9744 0 0 82062 4 3487 4203 2
0 87 10 0
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy
id wa st
0 5 0 34900836 556 9744 0 0 395478 433 4399 3190 3
1 83 14 0
0 1 0 34876656 556 9744 0 0 40074 4 2049 765 1
0 91 8 0
0 1 0 34872564 564 9744 0 0 300 10 1696 423 1
0 94 5 0
0 1 0 34869836 564 9744 0 0 334 0 1699 437 1
0 94 6 0
0 1 0 34867480 572 9744 0 0 329 6 1700 434 1
0 94 5 0
0 1 0 34865000 572 9748 0 0 311 0 1693 424 0
0 94 6 0
... (after 30 minutes)
1 0 0 34250164 540 9492 0 0 3 8 1619 285 6 0
94 0 0
1 0 0 34249920 548 9492 0 0 4 4 1618 288 6
0 94 0 0
1 0 0 34249920 548 9500 0 0 5 2 1617 289 6
0 94 0 0
procs -----------memory---------- ---swap-- -----io---- --system--
-----cpu------
r b swpd free buff cache si so bi bo in cs us sy
id wa st
1 0 0 34249920 556 9500 0 0 4 14 1624 293 6
0 94 0 0
1 0 0 34249796 564 9504 0 0 3 8 1620 286 6
0 94 0 0
1 0 0 34249800 572 9504 0 0 4 6 1619 290 6
0 94 0 0
1 0 0 34249676 580 9504 0 0 5 6 1622 293 6
0 94 0 0
1 0 0 34255388 588 9508 0 0 4 6 1618 289 6
0 94 0 0
1 0 0 34255388 608 9500 0 0 43 9 1624 304 6
0 94 0 0
1 0 0 34255080 616 9692 0 0 4 6 1619 289 6
0 94 0 0
1 0 0 34255080 624 9692 0 0 4 16 1623 288 6
0 94 0 0
1 0 0 34254956 632 9688 0 0 4 8 1619 286 6
0 94 0 0
1 0 0 34254836 640 9692 0 0 3 4 1617 281 6
0 94 0 0
1 0 0 34254836 640 9696 0 0 4 2 1617 285 6
0 94 0 0
1 0 0 34254836 648 9696 0 0 3 10 1620 286 6
0 94 0 0
1 0 0 34254712 656 9700 0 0 5 8 1620 290 6
0 94 0 0
1 0 0 34254716 664 9700 0 0 4 6 1619 286 6
0 94 0 0
==========x8
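As a quick way to quantify the stalls in the capture above, here is a small sketch (not part of the original report) that pulls the block-in (`bi`) column out of saved `vmstat 1` output. It assumes each sample sits on a single 17-column line as vmstat prints it (the archive wraps them); the sample rows below are taken from the trace above.

```python
# Extract the block-in ("bi") column from saved `vmstat 1` output.
# Illustrative only: bi is the 9th field (index 8) in the 17-column
# layout shown above; header lines are skipped because their first
# field is not numeric.

def bi_samples(lines):
    """Return bi values from vmstat data rows, ignoring header lines."""
    out = []
    for line in lines:
        fields = line.split()
        if len(fields) == 17 and fields[0].isdigit():
            out.append(int(fields[8]))
    return out

if __name__ == "__main__":
    sample = [
        "procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------",
        "r b swpd free buff cache si so bi bo in cs us sy id wa st",
        "0 1 0 35105432 468 9744 0 0 361 7 1800 451 1 0 94 6 0",
        "0 5 0 35026196 532 9744 0 0 127340 6 3111 3127 2 0 89 9 0",
    ]
    print(bi_samples(sample))  # [361, 127340]
```

Running this over the whole capture makes the pattern obvious: long stretches of a few hundred KB/s of reads punctuated by short bursts in the hundreds of MB/s.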
8x========== strace output for
# nohup strace -p 16623 -ttt -T > strace.out 2>&1 &
1326216486.322990 pread(4, "BMAP\0\0\0\376\0\0\0\0r\\\310@\0\0\0\0\v\367\342\271\0\0\0BQ\33j\0"..., 4096, 7859072671744) = 4096 <0.010081>
1326216487.813946 mprotect(0x7fe40cfe7000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000028>
1326216487.814068 pread(4, "BMAP\0\0\0\260\0\0\0\0r]T\242\0\0\0\0r]\323O\0\0\0BQ\270f\0"..., 4096, 822455537664) = 4096 <0.017904>
1326216488.866388 mprotect(0x7fe40cfe8000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000023>
1326216488.866519 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\367\342\271\0\0\0\0r]t\317\0\0\0BR\35\350\0"..., 4096, 7859205500928) = 4096 <0.009517>
1326216490.356207 mprotect(0x7fe40cfe9000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000033>
1326216490.356330 pread(4, "BMAP\0\0\0\376\0\0\0\0r]\323O\0\0\0\0ri\357\305\0\0\0BS\301\300\0"..., 4096, 7859106410496) = 4096 <0.010006>
1326216491.846474 mprotect(0x7fe40cfea000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216491.846590 pread(4, "BMAP\0\0\0\201\0\0\0\0r]t\317\0\0\0\0rj'\7\0\0\0Bl\321j\0"..., 4096, 7862456569856) = 4096 <0.012849>
1326216492.611395 mprotect(0x7fe40cfeb000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216492.611518 pread(4, "BMAP\0\0\0\200\0\0\0\0ri\357\305\0\0\0\0\v\371R\374\0\0\0Bm\236f\0"..., 4096, 7862514511872) = 4096 <0.011286>
1326216493.369264 mprotect(0x7fe40cfec000, 8192, PROT_READ|PROT_WRITE) = 0 <0.000028>
1326216493.369388 pread(4, "BMAP\0\0\0\376\0\0\0\0rj'\7\0\0\0\0\v\371\220\230\0\0\0Bm\351N\0"..., 4096, 822841688064) = 4096 <0.014758>
1326216494.873466 mprotect(0x7fe40cfee000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216494.873586 pread(4, "BMAP\0\0\0\223\0\0\0\0\v\371R\374\0\0\0\0\v\372#\237\0\0\0BnqT\0"..., 4096, 822906290176) = 4096 <0.011972>
1326216495.742900 mprotect(0x7fe40cfef000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216495.743018 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\371\220\230\0\0\0\0\v\371\315\370\0\0\0Bn\316V\0"..., 4096, 823060459520) = 4096 <0.014324>
1326216497.238220 mprotect(0x7fe40cff0000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216497.238347 pread(4, "BMAP\0\0\0\204\0\0\0\0\v\372#\237\0\0\0\0\v\372\357I\0\0\0Bom \0"..., 4096, 822970646528) = 4096 <0.014281>
1326216498.022466 mprotect(0x7fe40cff1000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216498.022584 pread(4, "BMAP\0\1\0\376\0\0\0\0r\21N=\0\0\0\0\f\200+\217\0\0\0\0!7\347S"..., 4096, 823104323584) = 4096 <0.011951>
1326216498.034632 mprotect(0x7fe40cff2000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000023>
1326216498.034712 pread(4, "BMAP\0\0\0\210\0\0\0\0\v\371\315\370\0\0\0\0\v\372Ms\0\0\0Bo\316\246\0"..., 4096, 823274016768) = 4096 <0.015055>
1326216498.843386 mprotect(0x7fe40cff3000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000033>
1326216498.843504 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\372\357I\0\0\0\0\v\372\225c\0\0\0Bp'4\0"..., 4096, 823104319488) = 4096 <0.015296>
1326216500.339933 mprotect(0x7fe40cff4000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000022>
1326216500.340026 pread(4, "BMAP\0\0\0\201\0\0\0\0\v\372Ms\0\0\0\0rjU\251\0\0\0Bp\312\24\0"..., 4096, 823179751424) = 4096 <0.009657>
1326216501.101929 mprotect(0x7fe40cff5000, 8192, PROT_READ|PROT_WRITE) = 0 <0.000029>
1326216501.102049 pread(4, "BMAP\0\0\0\201\0\0\0\0\v\372\225c\0\0\0\0\v\372\320<\0\0\0Bq\23\244\0"..., 4096, 7862563409920) = 4096 <0.016088>
1326216501.871373 mprotect(0x7fe40cff7000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216501.871494 pread(4, "BMAP\0\0\0\376\0\0\0\0rjU\251\0\0\0\0\v\373\20\341\0\0\0BqG\200\0"..., 4096, 823241457664) = 4096 <0.010275>
1326216503.364635 mprotect(0x7fe40cff8000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216503.364755 pread(4, "BMAP\0\0\0\200\0\0\0\0\v\372\320<\0\0\0\0\v\373\355\360\0\0\0Bq\303\242\0"..., 4096, 823309242368) = 4096 <0.016705>
1326216504.128107 mprotect(0x7fe40cff9000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000028>
1326216504.128232 pread(4, "BMAP\0\0\0\236\0\0\0\0\v\373\20\341\0\0\0\0\v\373f\25\0\0\0Br\37D\0"..., 4096, 823541039104) = 4096 <0.017153>
1326216505.076279 mprotect(0x7fe40cffa000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000033>
1326216505.076395 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\373\355\360\0\0\0\0\v\373\246i\0\0\0Br|\246\0"..., 4096, 823398584320) = 4096 <0.017912>
1326216506.576563 mprotect(0x7fe40cffb000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216506.576682 pread(4, "BMAP\0\0\0\220\0\0\0\0\v\373f\25\0\0\0\0\v\374\1774\0\0\0Bs!H\0"..., 4096, 823466037248) = 4096 <0.011574>
1326216507.429031 mprotect(0x7fe40cffc000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000033>
1326216507.429174 pread(4, "BMAP\0\0\0\303\0\0\0\0\v\373\246i\0\0\0\0\v\373\357/\0\0\0Bs~\346\0"..., 4096, 823693361152) = 4096 <0.014163>
1326216508.580628 mprotect(0x7fe40cffd000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000029>
1326216508.580747 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\374\1774\0\0\0\0\v\3746\372\0\0\0Bs\361P\0"..., 4096, 823542345728) = 4096 <0.018541>
1326216510.081415 mprotect(0x7fe40cffe000, 8192, PROT_READ|PROT_WRITE) = 0 <0.000029>
1326216510.081540 pread(4, "BMAP\0\0\0\235\0\0\0\0\v\373\357/\0\0\0\0\v\375&\346\0\0\0Bt\212b\0"..., 4096, 823617626112) = 4096 <0.014412>
1326216511.012066 mprotect(0x7fe40d000000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000033>
1326216511.012190 pread(4, "BMAP\0\0\0\313\0\0\0\0\v\3746\372\0\0\0\0\v\374t\263\0\0\0Bt\312\24\0"..., 4096, 823869202432) = 4096 <0.011888>
1326216512.208978 mprotect(0x7fe40d001000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216512.209100 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\375&\346\0\0\0\0\v\374\304\346\0\0\0Bu9,\0"..., 4096, 823682347008) = 4096 <0.015123>
1326216513.706751 mprotect(0x7fe40d002000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216513.706873 pread(4, "BMAP\0\0\0\237\0\0\0\0\v\374t\263\0\0\0\0\v\375\255\0\0\0\0Bu\257\354\0"..., 4096, 823766441984) = 4096 <0.015348>
1326216514.650434 mprotect(0x7fe40d003000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
1326216514.650554 pread(4, "BMAP\0\0\0\271\0\0\0\0\v\374\304\346\0\0\0\0\v\375\6\263\0\0\0Bv\f\202\0"..., 4096, 824009818112) = 4096 <0.008932>
1326216515.739710 mprotect(0x7fe40d004000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000029>
1326216515.739834 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\375\255\0\0\0\0\0\211f\332_\0\0\0Bvr\324\0"..., 4096, 823835439104) = 4096 <0.019711>
1326216517.242497 mprotect(0x7fe40d005000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216517.242618 pread(4, "BMAP\0\0\0\214\0\0\0\0\v\375\6\263\0\0\0\0\v\376i\317\0\0\0Bw\32\320\0"..., 4096, 9442176659456) = 4096 <0.012487>
1326216518.072182 mprotect(0x7fe40d006000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000029>
1326216518.072303 pread(4, "BMAP\0\0\0\253\0\0\0\0\211f\332_\0\0\0\0\215\30W\303\0\0\0BwT\254\0"..., 4096, 824207798272) = 4096 <0.013816>
1326216519.084891 mprotect(0x7fe40d007000, 8192, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216519.085008 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\376i\317\0\0\0\0\v\375\352p\0\0\0Bw\262d\0"..., 4096, 9695979646976) = 4096 <0.014198>
1326216520.581764 mprotect(0x7fe40d009000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216520.581881 pread(4, "BMAP\0\0\0\177\0\0\0\0\215\30W\303\0\0\0\0rj\230 \0\0\0Bxa\266\0"..., 4096, 824074240000) = 4096 <0.012057>
1326216521.335842 mprotect(0x7fe40d00a000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216521.335962 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\375\352p\0\0\0\0\v\376F\310\0\0\0Bx\253\24\0"..., 4096, 7862633103360) = 4096 <0.012886>
1326216522.832068 mprotect(0x7fe40d00b000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000029>
1326216522.832189 pread(4, "BMAP\0\0\0\376\0\0\0\0rj\230 \0\0\0\0\216\261\21\313\0\0\0By\351\0\0"..., 4096, 824171069440) = 4096 <0.015956>
1326216524.331849 mprotect(0x7fe40d00c000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000033>
1326216524.331975 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\376F\310\0\0\0\0\v\377\210\237\0\0\0B{\237\300\0"..., 4096, 9805696380928) = 4096 <0.012498>
1326216525.828539 mprotect(0x7fe40d00d000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000029>
1326216525.828659 pread(4, "BMAP\0\0\0\266\0\0\0\0\216\261\21\313\0\0\0\0rp2:\0\0\0B\206\243:\0"..., 4096, 824508542976) = 4096 <0.005462>
1326216526.897970 mprotect(0x7fe40d00e000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216526.898094 pread(4, "BMAP\0\0\0\376\0\0\0\0\v\377\210\237\0\0\0\0\f\21.7\0\0\0B\246\324\"\0"..., 4096, 7864136867840) = 4096 <0.009157>
1326216528.390700 mprotect(0x7fe40d00f000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000028>
1326216528.390822 pread(4, "BMAP\0\0\0\203\0\0\0\0rp2:\0\0\0\0\f\22\240\370\0\0\0B\252\332\22\0"..., 4096, 829245583360) = 4096 <0.014973>
1326216529.171630 mprotect(0x7fe40d010000, 8192, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216529.171750 pread(4, "BMAP\0\0\0\234\0\0\0\0\f\21.7\0\0\0\0\f\22%?\0\0\0B\253\2174\0"..., 4096, 829634347008) = 4096 <0.012732>
1326216530.095737 mprotect(0x7fe40d012000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216530.095856 pread(4, "BMAP\0\0\0\376\0\0\0\0\f\22\240\370\0\0\0\0\f\23\243\232\0\0\0B\254g\360\0"..., 4096, 829504614400) = 4096 <0.017642>
1326216531.597565 mprotect(0x7fe40d013000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000031>
1326216531.597683 pread(4, "BMAP\0\0\0\261\0\0\0\0\f\22%?\0\0\0\0\f\23\301\353\0\0\0B\255\263\212\0"..., 4096, 829905543168) = 4096 <0.012275>
1326216532.644040 mprotect(0x7fe40d014000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216532.644162 pread(4, "BMAP\0\0\0\210\0\0\0\0\f\23\243\232\0\0\0\0\f\22\374\322\0\0\0B\256\202\24\0"..., 4096, 829937332224) = 4096 <0.011165>
1326216533.450555 mprotect(0x7fe40d015000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
1326216533.450677 pread(4, "BMAP\0\0\0\376\0\0\0\0\f\23\301\353\0\0\0\0\f\24>\245\0\0\0B\257]\6\0"..., 4096, 829730660352) = 4096 <0.011231>
1326216534.946533 mprotect(0x7fe40d016000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216534.946653 pread(4, "BMAP\0\0\0\264\0\0\0\0\f\22\374\322\0\0\0\0\f\26\230!\0\0\0B\261\226\330\0"..., 4096, 830068117504) = 4096 <0.011953>
1326216536.010196 mprotect(0x7fe40d017000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000028>
1326216536.010308 pread(4, "BMAP\0\0\0\376\0\0\0\0\f\24>\245\0\0\0\0\f\26<\246\0\0\0B\262\371R\0"..., 4096, 830698819584) = 4096 <0.014615>
1326216537.509542 mprotect(0x7fe40d018000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216537.509665 pread(4, "BMAP\0\0\0\177\0\0\0\0\f\26\230!\0\0\0\0\f\30\32\36\0\0\0B\264\2260\0"..., 4096, 830602895360) = 4096 <0.016656>
1326216538.268615 mprotect(0x7fe40d019000, 8192, PROT_READ|PROT_WRITE) = 0 <0.000033>
1326216538.268743 pread(4, "BMAP\0\0\0\376\0\0\0\0\f\26<\246\0\0\0\0\f\0306\336\0\0\0B\265f\n\0"..., 4096, 831103557632) = 4096 <0.015820>
1326216539.777465 mprotect(0x7fe40d01b000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000032>
1326216539.777582 pread(4, "BMAP\0\0\0\220\0\0\0\0\f\30\32\36\0\0\0\0\f\34\312\215\0\0\0B\267C^\0"..., 4096, 831133704192) = 4096 <0.018219>
1326216540.637506 mprotect(0x7fe40d01c000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216540.637626 pread(4, "BMAP\0\0\0\311\0\0\0\0\f\0306\336\0\0\0\0rp\216\374\0\0\0B\270sV\0"..., 4096, 832362303488) = 4096 <0.011067>
1326216541.823721 mprotect(0x7fe40d01d000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000030>
1326216541.823839 pread(4, "BMAP\0\0\0\376\0\0\0\0\f\34\312\215\0\0\0\0rp\302F\0\0\0B\271\320,\0"..., 4096, 7864234131456) = 4096 <0.015157>
==========x8
[root@jr4-4 ~]# uname -r
2.6.32.41.scalable
Let me know if you need anything else.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
2011-08-19 16:37 bug: xfs_repair becomes very slow when file system has a large sparse file Joe Landman
2011-08-20 0:26 ` Dave Chinner
2011-08-20 0:38 ` Joe Landman
2011-08-20 2:00 ` Dave Chinner
2011-09-07 10:43 ` Christoph Hellwig
2012-01-10 17:32 ` Joe Landman