* Please hammer my for-linus branch
@ 2012-07-01 1:22 Chris Mason
2012-07-02 14:10 ` xfstests/224 lockup/slowdown (was: Please hammer my for-linus branch) David Sterba
2012-07-02 20:17 ` Please hammer my for-linus branch Chris Mason
0 siblings, 2 replies; 6+ messages in thread
From: Chris Mason @ 2012-07-01 1:22 UTC
To: linux-btrfs
Hi everyone,
I've got a nice set of fixes from Josef, Jan, Ilya and others in my
for-linus branch:
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
Some of the changes are fixes for the tree logging code, so I ran some
extra crash runs against them Friday night.
I ended up with a new crash in the tree log directory deletion replay
code, so I didn't send out the pull request to Linus.
It isn't clear yet if the new crash is because I was testing differently
or if it is a regression. I'm nailing it down this weekend, but please
give my for-linus a shot.
-chris
* Re: xfstests/224 lockup/slowdown (was: Please hammer my for-linus branch)
2012-07-01 1:22 Please hammer my for-linus branch Chris Mason
@ 2012-07-02 14:10 ` David Sterba
2012-07-02 14:34 ` David Sterba
2012-07-02 20:17 ` Please hammer my for-linus branch Chris Mason
1 sibling, 1 reply; 6+ messages in thread
From: David Sterba @ 2012-07-02 14:10 UTC
To: Chris Mason, linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 2359 bytes --]
Hi,
I'm seeing a machine lockup in xfstests/224, logs attached. Friday's
xfstests round with 3.5-rc4 was ok, all tests passed.
The 'dd' processes are in D-state with these stack traces:
5597 pts/0 D+ 0:00 dd status=noxfer if=/dev/zero of=/mnt/a2/testfile.8 bs=4k conv=notrunc
[<ffffffffa001bb3e>] reserve_metadata_bytes+0x33e/0x8f0 [btrfs]
[<ffffffffa001cd64>] btrfs_delalloc_reserve_metadata+0x134/0x3b0 [btrfs]
[<ffffffffa001d16b>] btrfs_delalloc_reserve_space+0x3b/0x60 [btrfs]
[<ffffffffa004132b>] __btrfs_buffered_write+0x17b/0x380 [btrfs]
[<ffffffffa0041783>] btrfs_file_aio_write+0x253/0x4e0 [btrfs]
[<ffffffff81144892>] do_sync_write+0xe2/0x120
[<ffffffff8114519e>] vfs_write+0xce/0x190
[<ffffffff811454e4>] sys_write+0x54/0xa0
[<ffffffff818b4fa9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
and (not sure if there are more)
5666 pts/0 D+ 0:00 dd status=noxfer if=/dev/zero of=/mnt/a2/testfile.6 bs=4k conv=notrunc
[<ffffffffa001bb3e>] reserve_metadata_bytes+0x33e/0x8f0 [btrfs]
[<ffffffffa001c56a>] btrfs_block_rsv_add+0x3a/0x60 [btrfs]
[<ffffffffa003155e>] start_transaction+0x26e/0x330 [btrfs]
[<ffffffffa0031903>] btrfs_start_transaction+0x13/0x20 [btrfs]
[<ffffffffa003cae0>] btrfs_dirty_inode+0xb0/0xe0 [btrfs]
[<ffffffffa003cdad>] btrfs_update_time+0xcd/0x180 [btrfs]
[<ffffffffa00416f8>] btrfs_file_aio_write+0x1c8/0x4e0 [btrfs]
[<ffffffff81144892>] do_sync_write+0xe2/0x120
[<ffffffff8114519e>] vfs_write+0xce/0x190
[<ffffffff811454e4>] sys_write+0x54/0xa0
[<ffffffff818b4fa9>] system_call_fastpath+0x16/0x1b
all btrfs kernel threads are idle.
Mount options: -o space_cache
Mkfs: fresh, default options
# btrfs fi df /mnt/a2
System: total=4.00MiB, used=4.00KiB
Data+Metadata: total=1020.00MiB, used=987.32MiB
[meanwhile]
While grabbing lockdep stats the test respawned
224 236s ... [14:57:42] [15:46:56] 2954s
but there was no disk activity, I wonder if touching /proc/lockdep or
/proc/lock_stat is affecting this.
Finishing this report anyway, and will redo the tests again.
Looking again into the logs, the first process snapshot (only D-state
processes) is much longer than the snapshot containing all processes.
Unfortunately I don't have timestamps recorded, but this suggests that it's
progressing very slowly, so slowly that I considered it stalled when looking
at the io graphs.
david
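The report above notes that the process snapshots were taken without timestamps. A minimal sketch of a timestamped D-state snapshot follows, assuming a Linux /proc that exposes /proc/<pid>/stat and /proc/<pid>/stack (the latter usually needs root and a kernel built with stack trace support). The function names are hypothetical and not part of the original report, and the sketch only looks at thread-group leaders, not at individual threads.

```python
import time
from pathlib import Path


def parse_state(stat_text: str) -> str:
    """Return the one-letter process state from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split after the last ')'.
    return stat_text.rsplit(")", 1)[1].split()[0]


def snapshot_d_state(proc: Path = Path("/proc")) -> list[tuple[float, int, str]]:
    """Return (timestamp, pid, kernel stack) for every process in D state."""
    now = time.time()
    found = []
    for entry in proc.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            if parse_state((entry / "stat").read_text()) != "D":
                continue
            stack = (entry / "stack").read_text()
        except OSError:
            continue  # process exited meanwhile, or stack is not readable
        found.append((now, int(entry.name), stack))
    return found
```

Calling snapshot_d_state() in a loop with a short sleep and logging each tuple would show whether the same 'dd' pids stay blocked in reserve_metadata_bytes or whether they make slow progress between samples.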
[-- Attachment #2: for-linus-hung-224-all.txt.gz --]
[-- Type: application/octet-stream, Size: 6081 bytes --]
[-- Attachment #3: for-linus-hung-224-D.txt.gz --]
[-- Type: application/octet-stream, Size: 5625 bytes --]
* Re: xfstests/224 lockup/slowdown (was: Please hammer my for-linus branch)
2012-07-02 14:10 ` xfstests/224 lockup/slowdown (was: Please hammer my for-linus branch) David Sterba
@ 2012-07-02 14:34 ` David Sterba
2012-07-02 16:10 ` David Sterba
0 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2012-07-02 14:34 UTC
To: Chris Mason, linux-btrfs
On Mon, Jul 02, 2012 at 04:10:52PM +0200, David Sterba wrote:
> Finishing this report anyway, and will redo the tests again.
>
> Looking again into the logs, the first process snapshot (only D-state
> processes) is much longer than the snapshot containing all processes.
> Unfortunately I don't have timestamps recorded, but this suggests that it's
> progressing very slowly, so slowly that I considered it stalled when looking
> at the io graphs.
Fresh build, reboot, and a single xfstests/224 run:
During the first ~20 seconds there's high write activity (i.e. file setup),
then it drops to a few tens to hundreds of KB every 4 seconds. The CPU is
idle; sample output from dstat:
----total-cpu-usage---- --dsk/sda9- ---system--
usr sys idl wai hiq siq| read writ| int csw
1 1 99 0 0 0| 0 0 | 923 1856
0 1 98 0 1 0| 0 8192B| 904 2796
0 1 99 0 0 0| 0 0 | 945 1914
1 1 98 0 0 0| 0 0 | 899 1849
1 1 98 0 0 1| 0 0 | 906 1848
0 3 97 0 0 0| 0 20k| 901 3740
0 0 100 0 0 0| 0 0 | 905 1851
1 1 98 0 0 1| 0 0 | 946 1917
0 1 99 0 0 0| 0 0 | 904 1858
0 1 99 0 0 0| 0 8192B| 907 2805
1 1 98 0 0 1| 0 0 | 891 1836
0 1 99 0 0 0| 0 0 | 900 1847
0 1 99 0 0 0| 0 0 | 940 1905
1 4 95 0 0 0| 0 32k| 904 5153
1 2 97 0 0 0| 0 36k| 913 4240
0 1 99 0 0 0| 0 0 | 907 1849
0 1 99 0 0 0| 0 0 | 908 1852
1 1 98 0 0 1| 0 0 | 933 1901
1 2 98 0 0 0| 0 8192B| 916 2808
0 1 99 0 0 0| 0 0 | 917 1843
0 1 99 0 0 1| 0 0 | 908 1844
1 1 99 0 0 0| 0 0 | 905 1860
0 5 95 0 0 0| 0 36k| 943 7565
1 1 99 0 0 0| 0 0 | 911 1861
0 1 99 0 0 0| 0 0 | 910 1852
1 1 98 0 0 0| 0 0 | 944 1878
1 2 97 0 0 1| 0 16k| 898 3753
0 9 87 4 0 1| 0 1020k|1035 11k
0 19 74 7 0 1| 0 2092k|3052 24k
0 1 99 0 0 0| 0 0 | 909 1851
1 1 98 0 0 1| 0 0 | 915 1856
1 1 99 0 0 0| 0 0 | 896 1847
0 2 98 0 0 0| 0 8192B| 931 2847
0 1 99 0 0 0| 0 0 | 899 1850
1 1 98 0 0 1| 0 0 | 896 1861
0 1 99 0 0 0| 0 0 | 911 1855
1 5 94 0 0 0| 0 28k| 891 6521
0 9 87 3 0 1| 0 1100k| 963 11k
0 1 99 0 0 0| 0 0 | 905 1857
1 1 99 0 0 0| 0 0 | 895 1851
1 1 98 0 0 0| 0 0 | 911 1852
0 7 88 4 0 1| 0 700k| 911 8533
0 1 99 0 0 0| 0 0 | 940 1905
1 1 99 0 0 0| 0 0 | 912 1851
1 1 99 0 0 0| 0 0 | 895 1851
0 10 89 0 0 1| 0 100k| 912 13k
and repeats more or less the same.
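To put a number on the write rate in output like the above, the "writ" column can be converted to bytes and averaged. This is only a sketch: the helper names are made up here, the unit suffixes are assumed to be 1024-based, and the per-second reading assumes dstat was run with its default one-second interval, which the message does not state. The four sample rows are copied from the output above.

```python
UNITS = {"B": 1, "k": 1024, "M": 1024**2, "G": 1024**3}  # assumed 1024-based


def dstat_bytes(token: str) -> int:
    """Convert a dstat size token such as '0', '8192B' or '20k' to bytes."""
    token = token.strip()
    if token[-1] in UNITS:
        return int(float(token[:-1]) * UNITS[token[-1]])
    return int(token)


def write_column(line: str) -> int:
    """Extract the 'writ' value from one row laid out as cpu|disk|system."""
    disk_group = line.split("|")[1]
    _read, writ = disk_group.split()
    return dstat_bytes(writ)


sample = """\
1 1 99 0 0 0| 0 0 | 923 1856
0 1 98 0 1 0| 0 8192B| 904 2796
0 3 97 0 0 0| 0 20k| 901 3740
0 9 87 4 0 1| 0 1020k|1035 11k
"""

writes = [write_column(row) for row in sample.splitlines()]
print("bytes written per sample:", writes)
print("average bytes per sample:", sum(writes) / len(writes))
```

Feeding the full capture through write_column would give the sustained rate for the slow phase, which can then be compared with the first ~20 seconds of file setup.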
Bisection in progress.
david
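Since the symptom is a slowdown rather than a crash (224 normally took 236s and the bad run took 2954s), one way to classify each bisection step is by wall-clock time. The sketch below is an assumption about how that could be automated, not what was actually done in this thread. It uses the exit codes that `git bisect run` understands: 0 for good, 1 for bad, 125 for "cannot test, skip". Each step still needs the kernel under test to be built and booted first, and a child that is stuck in uninterruptible sleep may not die when the timeout fires.

```python
import subprocess

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def classify(returncode, timed_out):
    """Map one test run to a bisect verdict."""
    if timed_out:
        return BAD  # too slow: treat as the regression
    # Any other failure is unrelated to the slowdown, so skip the commit.
    return GOOD if returncode == 0 else SKIP


def run_with_timeout(cmd, limit_s):
    """Run cmd and classify it; limit_s is the slowness threshold in seconds."""
    try:
        proc = subprocess.run(cmd, timeout=limit_s)
    except subprocess.TimeoutExpired:
        return classify(None, True)
    return classify(proc.returncode, False)
```

A wrapper script would call something like run_with_timeout(["./check", "224"], 600) from the xfstests directory and pass the result to sys.exit(); both the command line and the 600-second threshold are illustrative choices.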
* Re: xfstests/224 lockup/slowdown (was: Please hammer my for-linus branch)
2012-07-02 14:34 ` David Sterba
@ 2012-07-02 16:10 ` David Sterba
0 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2012-07-02 16:10 UTC
To: Chris Mason, linux-btrfs, jbacik
On Mon, Jul 02, 2012 at 04:34:53PM +0200, David Sterba wrote:
> Bisection in progress.
commit cae76522b19735c576803bec273f49062aa418ab
Author: Josef Bacik <jbacik@fusionio.com>
Date: Thu Jun 21 14:05:49 2012 -0400
Btrfs: flush delayed inodes if we're short on space
Those crazy gentoo guys have been complaining about ENOSPC errors on their
portage volumes. This is because doing things like untar tends to create
lots of new files which will soak up all the reservation space in the
delayed inodes. Usually this gets papered over by the fact that we will try
and commit the transaction, however if this happens in the wrong spot or we
choose not to commit the transaction you will be screwed. So add the
ability to explicitly flush delayed inodes to free up space. Please test
this out guys to make sure it works since as usual I cannot reproduce.
Thanks,
* Re: Please hammer my for-linus branch
2012-07-01 1:22 Please hammer my for-linus branch Chris Mason
2012-07-02 14:10 ` xfstests/224 lockup/slowdown (was: Please hammer my for-linus branch) David Sterba
@ 2012-07-02 20:17 ` Chris Mason
2012-07-03 14:39 ` David Sterba
1 sibling, 1 reply; 6+ messages in thread
From: Chris Mason @ 2012-07-02 20:17 UTC
To: linux-btrfs
On Sat, Jun 30, 2012 at 09:22:59PM -0400, Chris Mason wrote:
> Hi everyone,
>
> I've got a nice set of fixes from Josef, Jan, Ilya and others in my
> for-linus branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
>
> Some of the changes are fixes for the tree logging code, so I ran some
> extra crash runs against them Friday night.
>
> I ended up with a new crash in the tree log directory deletion replay
> code, so I didn't send out the pull request to Linus.
>
> It isn't clear yet if the new crash is because I was testing differently
> or if it is a regression. I'm nailing it down this weekend, but please
> give my for-linus a shot.
Ok, I've just rebased for-linus. I've dropped Josef's enospc patch,
which should fix the regression Dave hit. I've also added a fix for my
log replay crash, which was definitely an old bug. The delayed
directory operations were queuing up the changes made during replay, and
it was confusing the replay code.
Looks like there's a fix pending from Liu Bo, but I'll let Daniel test
that before pulling it in as well.
Thanks everyone.
-chris
* Re: Please hammer my for-linus branch
2012-07-02 20:17 ` Please hammer my for-linus branch Chris Mason
@ 2012-07-03 14:39 ` David Sterba
0 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2012-07-03 14:39 UTC
To: Chris Mason, linux-btrfs
On Mon, Jul 02, 2012 at 04:17:37PM -0400, Chris Mason wrote:
> Ok, I've just rebased for-linus. I've dropped Josef's enospc patch,
> which should fix the regression Dave hit.
JFYI, fixed. No other problems observed so far.
david
end of thread, other threads:[~2012-07-03 14:39 UTC | newest]
2012-07-01 1:22 Please hammer my for-linus branch Chris Mason
2012-07-02 14:10 ` xfstests/224 lockup/slowdown (was: Please hammer my for-linus branch) David Sterba
2012-07-02 14:34 ` David Sterba
2012-07-02 16:10 ` David Sterba
2012-07-02 20:17 ` Please hammer my for-linus branch Chris Mason
2012-07-03 14:39 ` David Sterba