* Re: unexpected results from simple test?
2009-06-22 10:41 unexpected results from simple test? Richard Kennedy
@ 2009-06-22 10:40 ` Gurudas Pai
2009-06-22 10:53 ` Jens Axboe
2009-06-22 10:51 ` Jens Axboe
1 sibling, 1 reply; 9+ messages in thread
From: Gurudas Pai @ 2009-06-22 10:40 UTC (permalink / raw)
To: Richard Kennedy; +Cc: fio
Richard Kennedy wrote:
> I been running a hopefully simple fio test, but I don't understand the
> results.
>
> the test :-
>
> fio --name=f1 --directory=/sda --rw=rw --size=400m
> --name=f2 --directory=/sdb --rw=write --size=1g --startdelay=10
>
> on linux 2.6.30-git
>
> running this takes quite while, 'time fio ...' gives
> real 3m49.174s
> user 0m0.838s
> sys 0m7.990s
>
> (edited highlights of the fio output)
> f1: (groupid=0, jobs=1): err= 0: pid=5181
> read : io=200MiB, bw=23,480KiB/s, iops=5,870, runt= 8722msec
> write: io=200MiB, bw=23,499KiB/s, iops=5,874, runt= 8715msec
> f2: (groupid=0, jobs=1): err= 0: pid=5182
> write: io=1,024MiB, bw=77,231KiB/s, iops=19,307, runt= 13577msec
>
> The jobs seem to finish reasonable quickly, but then fio seem to be
> stuck waiting for something. CPU usage is 100% but the CPU frequency
> stays at the lowest level and the system is really sluggish in its
> response. It does recover in time but this test does seem to torture my
> system.
>
> strace shows fio repeating this :-
> open and read several disk stats, such as
>
> open("/sys/block/sdb/stat", O_RDONLY) = 10
> fstat(10, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff9ef70c000
> read(10, " 45168 22897 2239004 1505"..., 4096) = 99
> close(10) = 0
> munmap(0x7ff9ef70c000, 4096) = 0
>
> then...
>
> gettimeofday({1245665343, 22984}, NULL) = 0
> gettimeofday({1245665343, 23015}, NULL) = 0
> ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
> gettimeofday({1245665343, 23086}, NULL) = 0
> gettimeofday({1245665343, 23115}, NULL) = 0
> gettimeofday({1245665343, 23150}, NULL) = 0
> setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
> rt_sigreturn(0) = -1 EINTR (Interrupted system call)
> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
> nanosleep({0, 10000000}, NULL) = 0
> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
> nanosleep({0, 10000000}, NULL) = 0
> ...
> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
> nanosleep({0, 10000000}, NULL) = 0
>
> then back to the open & reads..
>
> The machine is a AMD X2 64bit with 3G memory & fio is 1.26.
>
> Any ideas what's going on? or suggestions to help track it down?
Your test is doing buffered IO, so around ~1.4 GB of memory is consumed
by io units, so machine is sluggish. And when test is about to finish,
fio does fsync, so all data is sync'd up, and it is taking little long
time. and the main thread/process is waiting for fsync to complete.
Please run the same test with direct IO .
Add "direct=1" in your jobfile.
Thanks,
-Guru
^ permalink raw reply [flat|nested] 9+ messages in thread
* unexpected results from simple test?
@ 2009-06-22 10:41 Richard Kennedy
2009-06-22 10:40 ` Gurudas Pai
2009-06-22 10:51 ` Jens Axboe
0 siblings, 2 replies; 9+ messages in thread
From: Richard Kennedy @ 2009-06-22 10:41 UTC (permalink / raw)
To: fio
I been running a hopefully simple fio test, but I don't understand the
results.
the test :-
fio --name=f1 --directory=/sda --rw=rw --size=400m
--name=f2 --directory=/sdb --rw=write --size=1g --startdelay=10
on linux 2.6.30-git
running this takes quite while, 'time fio ...' gives
real 3m49.174s
user 0m0.838s
sys 0m7.990s
(edited highlights of the fio output)
f1: (groupid=0, jobs=1): err= 0: pid=5181
read : io=200MiB, bw=23,480KiB/s, iops=5,870, runt= 8722msec
write: io=200MiB, bw=23,499KiB/s, iops=5,874, runt= 8715msec
f2: (groupid=0, jobs=1): err= 0: pid=5182
write: io=1,024MiB, bw=77,231KiB/s, iops=19,307, runt= 13577msec
The jobs seem to finish reasonable quickly, but then fio seem to be
stuck waiting for something. CPU usage is 100% but the CPU frequency
stays at the lowest level and the system is really sluggish in its
response. It does recover in time but this test does seem to torture my
system.
strace shows fio repeating this :-
open and read several disk stats, such as
open("/sys/block/sdb/stat", O_RDONLY) = 10
fstat(10, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff9ef70c000
read(10, " 45168 22897 2239004 1505"..., 4096) = 99
close(10) = 0
munmap(0x7ff9ef70c000, 4096) = 0
then...
gettimeofday({1245665343, 22984}, NULL) = 0
gettimeofday({1245665343, 23015}, NULL) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
gettimeofday({1245665343, 23086}, NULL) = 0
gettimeofday({1245665343, 23115}, NULL) = 0
gettimeofday({1245665343, 23150}, NULL) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
rt_sigreturn(0) = -1 EINTR (Interrupted system call)
wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
...
wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
nanosleep({0, 10000000}, NULL) = 0
then back to the open & reads..
The machine is a AMD X2 64bit with 3G memory & fio is 1.26.
Any ideas what's going on? or suggestions to help track it down?
thanks
Richard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: unexpected results from simple test?
2009-06-22 10:41 unexpected results from simple test? Richard Kennedy
2009-06-22 10:40 ` Gurudas Pai
@ 2009-06-22 10:51 ` Jens Axboe
2009-06-22 11:16 ` Richard Kennedy
2009-06-22 14:20 ` Richard Kennedy
1 sibling, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2009-06-22 10:51 UTC (permalink / raw)
To: Richard Kennedy; +Cc: fio
On Mon, Jun 22 2009, Richard Kennedy wrote:
> I been running a hopefully simple fio test, but I don't understand the
> results.
>
> the test :-
>
> fio --name=f1 --directory=/sda --rw=rw --size=400m
> --name=f2 --directory=/sdb --rw=write --size=1g --startdelay=10
>
> on linux 2.6.30-git
>
> running this takes quite while, 'time fio ...' gives
> real 3m49.174s
> user 0m0.838s
> sys 0m7.990s
That does looke very odd.
> (edited highlights of the fio output)
> f1: (groupid=0, jobs=1): err= 0: pid=5181
> read : io=200MiB, bw=23,480KiB/s, iops=5,870, runt= 8722msec
> write: io=200MiB, bw=23,499KiB/s, iops=5,874, runt= 8715msec
> f2: (groupid=0, jobs=1): err= 0: pid=5182
> write: io=1,024MiB, bw=77,231KiB/s, iops=19,307, runt= 13577msec
>
> The jobs seem to finish reasonable quickly, but then fio seem to be
> stuck waiting for something. CPU usage is 100% but the CPU frequency
> stays at the lowest level and the system is really sluggish in its
> response. It does recover in time but this test does seem to torture my
> system.
>
> strace shows fio repeating this :-
> open and read several disk stats, such as
>
> open("/sys/block/sdb/stat", O_RDONLY) = 10
> fstat(10, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff9ef70c000
> read(10, " 45168 22897 2239004 1505"..., 4096) = 99
> close(10) = 0
> munmap(0x7ff9ef70c000, 4096) = 0
>
> then...
>
> gettimeofday({1245665343, 22984}, NULL) = 0
> gettimeofday({1245665343, 23015}, NULL) = 0
> ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
> gettimeofday({1245665343, 23086}, NULL) = 0
> gettimeofday({1245665343, 23115}, NULL) = 0
> gettimeofday({1245665343, 23150}, NULL) = 0
> setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
> rt_sigreturn(0) = -1 EINTR (Interrupted system call)
> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
> nanosleep({0, 10000000}, NULL) = 0
> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
> nanosleep({0, 10000000}, NULL) = 0
> ...
> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
> nanosleep({0, 10000000}, NULL) = 0
>
> then back to the open & reads..
This is normal, since it looks like you are stracing the parent process.
You'll want to inspect the two processes that are actually performing
the IO.
> The machine is a AMD X2 64bit with 3G memory & fio is 1.26.
Did you pull changes into that fio repo and not do a make clean before
compiling it? The build system is a bit broken, so it wont always
rebuild everything if eg fio.h had changes to structures. And that'll
cause weird behaviour sometimes.
In any case, please try and download a new fresh version and see if that
makes it any better.
There was also a recent linux kernel futex problem that fio triggered,
so it could also be a kernel issue. Again, knowing what the two other
jobs are doing would help.
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: unexpected results from simple test?
2009-06-22 10:40 ` Gurudas Pai
@ 2009-06-22 10:53 ` Jens Axboe
0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2009-06-22 10:53 UTC (permalink / raw)
To: Gurudas Pai; +Cc: Richard Kennedy, fio
On Mon, Jun 22 2009, Gurudas Pai wrote:
>
> Richard Kennedy wrote:
>> I been running a hopefully simple fio test, but I don't understand the
>> results.
>>
>> the test :-
>>
>> fio --name=f1 --directory=/sda --rw=rw --size=400m
>> --name=f2 --directory=/sdb --rw=write --size=1g --startdelay=10
>>
>> on linux 2.6.30-git
>>
>> running this takes quite while, 'time fio ...' gives
>> real 3m49.174s
>> user 0m0.838s
>> sys 0m7.990s
>>
>> (edited highlights of the fio output)
>> f1: (groupid=0, jobs=1): err= 0: pid=5181
>> read : io=200MiB, bw=23,480KiB/s, iops=5,870, runt= 8722msec
>> write: io=200MiB, bw=23,499KiB/s, iops=5,874, runt= 8715msec
>> f2: (groupid=0, jobs=1): err= 0: pid=5182
>> write: io=1,024MiB, bw=77,231KiB/s, iops=19,307, runt= 13577msec
>>
>> The jobs seem to finish reasonable quickly, but then fio seem to be
>> stuck waiting for something. CPU usage is 100% but the CPU frequency
>> stays at the lowest level and the system is really sluggish in its
>> response. It does recover in time but this test does seem to torture my
>> system.
>>
>> strace shows fio repeating this :-
>> open and read several disk stats, such as
>>
>> open("/sys/block/sdb/stat", O_RDONLY) = 10
>> fstat(10, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff9ef70c000
>> read(10, " 45168 22897 2239004 1505"..., 4096) = 99
>> close(10) = 0
>> munmap(0x7ff9ef70c000, 4096) = 0
>>
>> then...
>>
>> gettimeofday({1245665343, 22984}, NULL) = 0
>> gettimeofday({1245665343, 23015}, NULL) = 0
>> ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
>> gettimeofday({1245665343, 23086}, NULL) = 0
>> gettimeofday({1245665343, 23115}, NULL) = 0
>> gettimeofday({1245665343, 23150}, NULL) = 0
>> setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
>> rt_sigreturn(0) = -1 EINTR (Interrupted system call)
>> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
>> nanosleep({0, 10000000}, NULL) = 0
>> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
>> nanosleep({0, 10000000}, NULL) = 0
>> ...
>> wait4(5321, 0x7fffcfed49d4, WNOHANG, NULL) = 0
>> nanosleep({0, 10000000}, NULL) = 0
>>
>> then back to the open & reads..
>>
>> The machine is a AMD X2 64bit with 3G memory & fio is 1.26.
>>
>> Any ideas what's going on? or suggestions to help track it down?
>
> Your test is doing buffered IO, so around ~1.4 GB of memory is consumed
> by io units, so machine is sluggish. And when test is about to finish,
> fio does fsync, so all data is sync'd up, and it is taking little long
> time. and the main thread/process is waiting for fsync to complete.
> Please run the same test with direct IO .
> Add "direct=1" in your jobfile.
It wont fsync by default, so I don't think that is it. But if we get an
overview of where the other fio processes are stuck, it'll be a bit
easier to see what the problem is here :-)
If you are in fsync, then fio will also list an 'F' for that job in the
status line. So that should be easy to tell.
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: unexpected results from simple test?
2009-06-22 10:51 ` Jens Axboe
@ 2009-06-22 11:16 ` Richard Kennedy
2009-06-22 14:20 ` Richard Kennedy
1 sibling, 0 replies; 9+ messages in thread
From: Richard Kennedy @ 2009-06-22 11:16 UTC (permalink / raw)
To: Jens Axboe; +Cc: fio
On Mon, 2009-06-22 at 12:51 +0200, Jens Axboe wrote:
>
> Did you pull changes into that fio repo and not do a make clean before
> compiling it? The build system is a bit broken, so it wont always
> rebuild everything if eg fio.h had changes to structures. And that'll
> cause weird behaviour sometimes.
> In any case, please try and download a new fresh version and see if that
> makes it any better.
>
> There was also a recent linux kernel futex problem that fio triggered,
> so it could also be a kernel issue. Again, knowing what the two other
> jobs are doing would help.
>
I built fio from fio-1.26.tar.gz so it was in a clean directory.
I've just downloaded the 1.28 snapshot and will try that, but I can try
the git version if you like.
I get the status of both jobs but it get stuck at
> Jobs: 1 (f=0): [_W] [100.0% done] [ 0/ 0 kb/s] [eta 00m:00s]
I'll try 1.28 & let you know.
I haven't updated the kernel is a few days, so I can try the latest git
as well.
regards
Richard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: unexpected results from simple test?
2009-06-22 10:51 ` Jens Axboe
2009-06-22 11:16 ` Richard Kennedy
@ 2009-06-22 14:20 ` Richard Kennedy
2009-06-22 17:57 ` Jens Axboe
1 sibling, 1 reply; 9+ messages in thread
From: Richard Kennedy @ 2009-06-22 14:20 UTC (permalink / raw)
To: Jens Axboe; +Cc: fio
> In any case, please try and download a new fresh version and see if that
> makes it any better.
>
> There was also a recent linux kernel futex problem that fio triggered,
> so it could also be a kernel issue. Again, knowing what the two other
> jobs are doing would help.
>
Hi Jens,
'strace -f' running on a freshly built fio 1.28
It runs somewhat slowly but I just get lots of reads & writes like these
[pid 6597] gettimeofday({1245679454, 899777}, NULL) = 0
[pid 6597] lseek(8, 419278848, SEEK_SET) = 419278848
[pid 6597] gettimeofday({1245679454, 899917}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 899988}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 900059}, NULL) = 0
[pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
\326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
[pid 6597] gettimeofday({1245679454, 900244}, NULL) = 0
[pid 6597] lseek(8, 419282944, SEEK_SET) = 419282944
[pid 6597] gettimeofday({1245679454, 900391}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 900463}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 900534}, NULL) = 0
[pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
\326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
[pid 6597] gettimeofday({1245679454, 900705}, NULL) = 0
[pid 6597] lseek(8, 419287040, SEEK_SET) = 419287040
[pid 6597] gettimeofday({1245679454, 900845}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 900914}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 900985}, NULL) = 0
[pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
\326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
[pid 6597] gettimeofday({1245679454, 901171}, NULL) = 0
[pid 6597] lseek(8, 419291136, SEEK_SET) = 419291136
[pid 6597] gettimeofday({1245679454, 901309}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 901379}, NULL) = 0
[pid 6597] gettimeofday({1245679454, 901460}, NULL) = 0
[pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
\326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
where pid 6597 is job f1
I'll rebuild my kernel from the latest git, and check that.
regards
Richard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: unexpected results from simple test?
2009-06-22 14:20 ` Richard Kennedy
@ 2009-06-22 17:57 ` Jens Axboe
2009-06-23 10:02 ` Richard Kennedy
0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2009-06-22 17:57 UTC (permalink / raw)
To: Richard Kennedy; +Cc: fio
On Mon, Jun 22 2009, Richard Kennedy wrote:
>
> > In any case, please try and download a new fresh version and see if that
> > makes it any better.
> >
> > There was also a recent linux kernel futex problem that fio triggered,
> > so it could also be a kernel issue. Again, knowing what the two other
> > jobs are doing would help.
> >
> Hi Jens,
>
> 'strace -f' running on a freshly built fio 1.28
>
> It runs somewhat slowly but I just get lots of reads & writes like these
>
> [pid 6597] gettimeofday({1245679454, 899777}, NULL) = 0
> [pid 6597] lseek(8, 419278848, SEEK_SET) = 419278848
> [pid 6597] gettimeofday({1245679454, 899917}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 899988}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 900059}, NULL) = 0
> [pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> [pid 6597] gettimeofday({1245679454, 900244}, NULL) = 0
> [pid 6597] lseek(8, 419282944, SEEK_SET) = 419282944
> [pid 6597] gettimeofday({1245679454, 900391}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 900463}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 900534}, NULL) = 0
> [pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> [pid 6597] gettimeofday({1245679454, 900705}, NULL) = 0
> [pid 6597] lseek(8, 419287040, SEEK_SET) = 419287040
> [pid 6597] gettimeofday({1245679454, 900845}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 900914}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 900985}, NULL) = 0
> [pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> [pid 6597] gettimeofday({1245679454, 901171}, NULL) = 0
> [pid 6597] lseek(8, 419291136, SEEK_SET) = 419291136
> [pid 6597] gettimeofday({1245679454, 901309}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 901379}, NULL) = 0
> [pid 6597] gettimeofday({1245679454, 901460}, NULL) = 0
> [pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
>
>
> where pid 6597 is job f1
>
> I'll rebuild my kernel from the latest git, and check that.
So it's still running, perhaps the eta is just off. Is write back
caching disabled on that drive?
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: unexpected results from simple test?
2009-06-22 17:57 ` Jens Axboe
@ 2009-06-23 10:02 ` Richard Kennedy
2009-06-23 10:54 ` Jens Axboe
0 siblings, 1 reply; 9+ messages in thread
From: Richard Kennedy @ 2009-06-23 10:02 UTC (permalink / raw)
To: Jens Axboe; +Cc: fio
On Mon, 2009-06-22 at 19:57 +0200, Jens Axboe wrote:
> On Mon, Jun 22 2009, Richard Kennedy wrote:
> >
> > > In any case, please try and download a new fresh version and see if that
> > > makes it any better.
> > >
> > > There was also a recent linux kernel futex problem that fio triggered,
> > > so it could also be a kernel issue. Again, knowing what the two other
> > > jobs are doing would help.
> > >
> > Hi Jens,
> >
> > 'strace -f' running on a freshly built fio 1.28
> >
> > It runs somewhat slowly but I just get lots of reads & writes like these
> >
> > [pid 6597] gettimeofday({1245679454, 899777}, NULL) = 0
> > [pid 6597] lseek(8, 419278848, SEEK_SET) = 419278848
> > [pid 6597] gettimeofday({1245679454, 899917}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 899988}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 900059}, NULL) = 0
> > [pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> > [pid 6597] gettimeofday({1245679454, 900244}, NULL) = 0
> > [pid 6597] lseek(8, 419282944, SEEK_SET) = 419282944
> > [pid 6597] gettimeofday({1245679454, 900391}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 900463}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 900534}, NULL) = 0
> > [pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> > [pid 6597] gettimeofday({1245679454, 900705}, NULL) = 0
> > [pid 6597] lseek(8, 419287040, SEEK_SET) = 419287040
> > [pid 6597] gettimeofday({1245679454, 900845}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 900914}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 900985}, NULL) = 0
> > [pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> > [pid 6597] gettimeofday({1245679454, 901171}, NULL) = 0
> > [pid 6597] lseek(8, 419291136, SEEK_SET) = 419291136
> > [pid 6597] gettimeofday({1245679454, 901309}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 901379}, NULL) = 0
> > [pid 6597] gettimeofday({1245679454, 901460}, NULL) = 0
> > [pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> >
> >
> > where pid 6597 is job f1
> >
> > I'll rebuild my kernel from the latest git, and check that.
>
> So it's still running, perhaps the eta is just off. Is write back
> caching disabled on that drive?
>
It seems better on the latest git kernel, the time for the complete test
is now 1m30.
Although the eta times do seem strange. As fio starts up I see extremely
large eta's flash up , i.e. some huge number of days, but I haven't
managed to capture one yet, and it says it's finished long before it
actually stops.
Yes, strangely the write_cache is disabled on sda, but enabled on sdb.
I'll turn it on and see if there's any difference.
regards
Richard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: unexpected results from simple test?
2009-06-23 10:02 ` Richard Kennedy
@ 2009-06-23 10:54 ` Jens Axboe
0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2009-06-23 10:54 UTC (permalink / raw)
To: Richard Kennedy; +Cc: fio
On Tue, Jun 23 2009, Richard Kennedy wrote:
> On Mon, 2009-06-22 at 19:57 +0200, Jens Axboe wrote:
> > On Mon, Jun 22 2009, Richard Kennedy wrote:
> > >
> > > > In any case, please try and download a new fresh version and see if that
> > > > makes it any better.
> > > >
> > > > There was also a recent linux kernel futex problem that fio triggered,
> > > > so it could also be a kernel issue. Again, knowing what the two other
> > > > jobs are doing would help.
> > > >
> > > Hi Jens,
> > >
> > > 'strace -f' running on a freshly built fio 1.28
> > >
> > > It runs somewhat slowly but I just get lots of reads & writes like these
> > >
> > > [pid 6597] gettimeofday({1245679454, 899777}, NULL) = 0
> > > [pid 6597] lseek(8, 419278848, SEEK_SET) = 419278848
> > > [pid 6597] gettimeofday({1245679454, 899917}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 899988}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 900059}, NULL) = 0
> > > [pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> > > [pid 6597] gettimeofday({1245679454, 900244}, NULL) = 0
> > > [pid 6597] lseek(8, 419282944, SEEK_SET) = 419282944
> > > [pid 6597] gettimeofday({1245679454, 900391}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 900463}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 900534}, NULL) = 0
> > > [pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> > > [pid 6597] gettimeofday({1245679454, 900705}, NULL) = 0
> > > [pid 6597] lseek(8, 419287040, SEEK_SET) = 419287040
> > > [pid 6597] gettimeofday({1245679454, 900845}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 900914}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 900985}, NULL) = 0
> > > [pid 6597] write(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> > > [pid 6597] gettimeofday({1245679454, 901171}, NULL) = 0
> > > [pid 6597] lseek(8, 419291136, SEEK_SET) = 419291136
> > > [pid 6597] gettimeofday({1245679454, 901309}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 901379}, NULL) = 0
> > > [pid 6597] gettimeofday({1245679454, 901460}, NULL) = 0
> > > [pid 6597] read(8, "+\265bB\305\243g+\214\313\313\377\20H\237\356\266
> > > \326X\335<\37\317K\207\317,/\335J\206\267\33"..., 4096) = 4096
> > >
> > >
> > > where pid 6597 is job f1
> > >
> > > I'll rebuild my kernel from the latest git, and check that.
> >
> > So it's still running, perhaps the eta is just off. Is write back
> > caching disabled on that drive?
> >
>
> It seems better on the latest git kernel, the time for the complete test
> is now 1m30.
> Although the eta times do seem strange. As fio starts up I see extremely
> large eta's flash up , i.e. some huge number of days, but I haven't
> managed to capture one yet, and it says it's finished long before it
> actually stops.
>
> Yes, strangely the write_cache is disabled on sda, but enabled on sdb.
>
> I'll turn it on and see if there's any difference.
So at this point I'm not sure exactly what you see as the problem. If
write caching is disabled, a read/write mix or randwrite is going to be
really slow. 90 seconds is ~4MB/sec, that sounds very plausible.
The eta may not be too good, probably because initially you are just
offloading to page cache. Once you start hitting platter, speeds will
plummet.
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-06-23 10:54 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-22 10:41 unexpected results from simple test? Richard Kennedy
2009-06-22 10:40 ` Gurudas Pai
2009-06-22 10:53 ` Jens Axboe
2009-06-22 10:51 ` Jens Axboe
2009-06-22 11:16 ` Richard Kennedy
2009-06-22 14:20 ` Richard Kennedy
2009-06-22 17:57 ` Jens Axboe
2009-06-23 10:02 ` Richard Kennedy
2009-06-23 10:54 ` Jens Axboe
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.