* Very slow O_DIRECT writes on NFS in .36
@ 2010-11-18 20:07 Ben Greear
  2010-11-18 20:17 ` Chuck Lever
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Greear @ 2010-11-18 20:07 UTC (permalink / raw)
  To: linux-nfs

I applied the NFS O_DIRECT patch (and all others) from the pending 2.6.36 stable
queue, and now I can at least use O_DIRECT w/out immediate failure.

However, I notice that when writing 2k chunks with O_DIRECT on
NFS, it runs extremely slowly (about 300Kbps throughput).  The
server is a Fedora 13 64-bit system running 2.6.34.7-56.fc13.x86_64

Here's some strace -ttT output for the writer:

07:03:42.898058 write(9, "\370'\37\345v\230\315\253\3\0\0\0\354\7\0\0\16\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.059402>
07:03:42.957649 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000266>
07:03:42.958148 write(9, "\212$s\327v\230\315\253\3\0\0\0\354\7\0\0\17\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.069295>
07:03:43.027524 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000011>


Writing 64k chunks takes basically the same amount of time per system call:

07:06:13.537488 write(9, "\5\340\202\262v\230\315\253\3\0\0\0\354\377\0\0\6\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.049462>
07:06:13.587083 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000035>
07:06:13.587410 write(9, "\250\231\377cv\230\315\253\3\0\0\0\354\377\0\0\7\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.058612>
07:06:13.646233 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000095>
07:06:13.646616 write(9, "\5-@\5v\230\315\253\3\0\0\0\354\377\0\0\10\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.050282>


Reading is a good deal faster... about 34 Mbps with O_DIRECT, NFS and 2k reads.

Any ideas about why the write performance is so bad?
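
For reference, the timings above work out to roughly 34 KB/s for the 2 KB
chunks (about 270 kbit/s) and roughly 1.3 MB/s for the 64 KB chunks, so the
cost looks per-syscall rather than per-byte. Below is a minimal sketch of the
kind of O_DIRECT writer being traced; the path, chunk count, and buffer
alignment are illustrative assumptions, not the actual test tool:

/* Minimal sketch (assumed, not the original tool): write 2 KB chunks
 * to a file on an NFS mount opened with O_DIRECT. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const size_t chunk = 2048;      /* chunk size seen in the strace output */
	const int count = 1024;         /* number of chunks to write */
	void *buf;
	int fd, i;

	/* O_DIRECT generally wants an aligned user buffer. */
	if (posix_memalign(&buf, 4096, chunk))
		return 1;
	memset(buf, 'a', chunk);

	fd = open("/mnt/nfs/testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	for (i = 0; i < count; i++) {
		if (write(fd, buf, chunk) != (ssize_t)chunk) {
			perror("write");
			return 1;
		}
	}

	close(fd);
	free(buf);
	return 0;
}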

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Very slow O_DIRECT writes on NFS in .36
  2010-11-18 20:07 Very slow O_DIRECT writes on NFS in .36 Ben Greear
@ 2010-11-18 20:17 ` Chuck Lever
  2010-11-18 20:23   ` Ben Greear
  0 siblings, 1 reply; 4+ messages in thread
From: Chuck Lever @ 2010-11-18 20:17 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-nfs


On Nov 18, 2010, at 3:07 PM, Ben Greear wrote:

> I applied the NFS O_DIRECT patch (and all others) from the pending 2.6.36 stable
> queue, and now I can at least use O_DIRECT w/out immediate failure.
> 
> However, I notice that when writing 2k chunks with O_DIRECT on
> NFS, it runs extremely slowly (about 300Kbps throughput).  The
> server is a Fedora 13 64-bit system running 2.6.34.7-56.fc13.x86_64
> 
> Here's some strace -ttT output for the writer:
> 
> 07:03:42.898058 write(9, "\370'\37\345v\230\315\253\3\0\0\0\354\7\0\0\16\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.059402>
> 07:03:42.957649 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000266>
> 07:03:42.958148 write(9, "\212$s\327v\230\315\253\3\0\0\0\354\7\0\0\17\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.069295>
> 07:03:43.027524 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000011>
> 
> 
> Writing 64k chunks takes basically the same amount of time per system call:
> 
> 07:06:13.537488 write(9, "\5\340\202\262v\230\315\253\3\0\0\0\354\377\0\0\6\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.049462>
> 07:06:13.587083 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000035>
> 07:06:13.587410 write(9, "\250\231\377cv\230\315\253\3\0\0\0\354\377\0\0\7\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.058612>
> 07:06:13.646233 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000095>
> 07:06:13.646616 write(9, "\5-@\5v\230\315\253\3\0\0\0\354\377\0\0\10\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.050282>
> 
> 
> Reading is a good deal faster... about 34 Mbps with O_DIRECT, NFS and 2k reads.
> 
> Any ideas about why the write performance is so bad?

A network trace will probably show you that the per-write latency is due to the server.

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com






* Re: Very slow O_DIRECT writes on NFS in .36
  2010-11-18 20:17 ` Chuck Lever
@ 2010-11-18 20:23   ` Ben Greear
  2010-11-18 22:42     ` Trond Myklebust
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Greear @ 2010-11-18 20:23 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On 11/18/2010 12:17 PM, Chuck Lever wrote:
>
> On Nov 18, 2010, at 3:07 PM, Ben Greear wrote:
>
>> I applied the NFS O_DIRECT patch (and all others) from the pending 2.6.36 stable
>> queue, and now I can at least use O_DIRECT w/out immediate failure.
>>
>> However, I notice that when writing 2k chunks with O_DIRECT on
>> NFS, it runs extremely slowly (about 300Kbps throughput).  The
>> server is a Fedora 13 64-bit system running 2.6.34.7-56.fc13.x86_64
>>
>> Here's some strace -ttT output for the writer:
>>
>> 07:03:42.898058 write(9, "\370'\37\345v\230\315\253\3\0\0\0\354\7\0\0\16\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.059402>
>> 07:03:42.957649 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000266>
>> 07:03:42.958148 write(9, "\212$s\327v\230\315\253\3\0\0\0\354\7\0\0\17\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.069295>
>> 07:03:43.027524 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000011>
>>
>>
>> Writing 64k chunks takes basically the same amount of time per system call:
>>
>> 07:06:13.537488 write(9, "\5\340\202\262v\230\315\253\3\0\0\0\354\377\0\0\6\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.049462>
>> 07:06:13.587083 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000035>
>> 07:06:13.587410 write(9, "\250\231\377cv\230\315\253\3\0\0\0\354\377\0\0\7\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.058612>
>> 07:06:13.646233 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000095>
>> 07:06:13.646616 write(9, "\5-@\5v\230\315\253\3\0\0\0\354\377\0\0\10\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.050282>
>>
>>
>> Reading is a good deal faster... about 34 Mbps with O_DIRECT, NFS and 2k reads.
>>
>> Any ideas about why the write performance is so bad?
>
> A network trace will probably show you that the per-write latency is due to the server.
>

Looks like you are right.  I don't remember it being this slow before, but
maybe it was.  We'll run some tests with older kernels and/or different
servers.

   6.700193 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:96256 Len:1024 FILE_SYNC
   6.740547 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12321 Ack=101729 Win=501 Len=0 TSV=218471603 TSER=1385525
   6.769380 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 262) Len:1024 FILE_SYNC
   6.769609 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:97280 Len:1024 FILE_SYNC
   6.809777 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12461 Ack=102885 Win=501 Len=0 TSV=218471673 TSER=1385594
   6.850373 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 265) Len:1024 FILE_SYNC
   6.850631 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:98304 Len:1024 FILE_SYNC
   6.890845 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12601 Ack=104041 Win=501 Len=0 TSV=218471754 TSER=1385675
   6.930344 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 268) Len:1024 FILE_SYNC
   6.930703 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:99328 Len:1024 FILE_SYNC
   6.971753 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12741 Ack=105197 Win=501 Len=0 TSV=218471834 TSER=1385755
   6.980341 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 271) Len:1024 FILE_SYNC


Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Very slow O_DIRECT writes on NFS in .36
  2010-11-18 20:23   ` Ben Greear
@ 2010-11-18 22:42     ` Trond Myklebust
  0 siblings, 0 replies; 4+ messages in thread
From: Trond Myklebust @ 2010-11-18 22:42 UTC (permalink / raw)
  To: Ben Greear; +Cc: Chuck Lever, linux-nfs

On Thu, 2010-11-18 at 12:23 -0800, Ben Greear wrote:
> On 11/18/2010 12:17 PM, Chuck Lever wrote:
> >
> > On Nov 18, 2010, at 3:07 PM, Ben Greear wrote:
> >
> >> I applied the NFS O_DIRECT patch (and all others) from the pending 2.6.36 stable
> >> queue, and now I can at least use O_DIRECT w/out immediate failure.
> >>
> >> However, I notice that when writing 2k chunks with O_DIRECT on
> >> NFS, it runs extremely slowly (about 300Kbps throughput).  The
> >> server is a Fedora 13 64-bit system running 2.6.34.7-56.fc13.x86_64
> >>
> >> Here's some strace -ttT output for the writer:
> >>
> >> 07:03:42.898058 write(9, "\370'\37\345v\230\315\253\3\0\0\0\354\7\0\0\16\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.059402>
> >> 07:03:42.957649 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000266>
> >> 07:03:42.958148 write(9, "\212$s\327v\230\315\253\3\0\0\0\354\7\0\0\17\1\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 2048) = 2048 <0.069295>
> >> 07:03:43.027524 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000011>
> >>
> >>
> >> Writing 64k chunks takes basically the same amount of time per system call:
> >>
> >> 07:06:13.537488 write(9, "\5\340\202\262v\230\315\253\3\0\0\0\354\377\0\0\6\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.049462>
> >> 07:06:13.587083 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000035>
> >> 07:06:13.587410 write(9, "\250\231\377cv\230\315\253\3\0\0\0\354\377\0\0\7\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.058612>
> >> 07:06:13.646233 poll([{fd=4, events=POLLIN}, {fd=8, events=POLLIN}, {fd=6, events=POLLIN}], 3, 0) = 0 (Timeout) <0.000095>
> >> 07:06:13.646616 write(9, "\5-@\5v\230\315\253\3\0\0\0\354\377\0\0\10\0\0\0\0\1\2\3\4\5\6\7\10\t\n\v"..., 65536) = 65536 <0.050282>
> >>
> >>
> >> Reading is a good deal faster... about 34 Mbps with O_DIRECT, NFS and 2k reads.
> >>
> >> Any ideas about why the write performance is so bad?
> >
> > A network trace will probably show you that the per-write latency is due to the server.
> >
> 
> Looks like you are right.  I don't remember it being this slow before, but
> maybe it was.  We'll run some tests with older kernels and/or different
> servers.
> 
>    6.700193 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:96256 Len:1024 FILE_SYNC
>    6.740547 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12321 Ack=101729 Win=501 Len=0 TSV=218471603 TSER=1385525
>    6.769380 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 262) Len:1024 FILE_SYNC
>    6.769609 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:97280 Len:1024 FILE_SYNC
>    6.809777 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12461 Ack=102885 Win=501 Len=0 TSV=218471673 TSER=1385594
>    6.850373 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 265) Len:1024 FILE_SYNC
>    6.850631 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:98304 Len:1024 FILE_SYNC
>    6.890845 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12601 Ack=104041 Win=501 Len=0 TSV=218471754 TSER=1385675
>    6.930344 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 268) Len:1024 FILE_SYNC
>    6.930703 192.168.100.173 -> 192.168.100.3 NFS V3 WRITE Call, FH:0x6bc05782 Offset:99328 Len:1024 FILE_SYNC
>    6.971753 192.168.100.3 -> 192.168.100.173 TCP 2049 > 800 [ACK] Seq=12741 Ack=105197 Win=501 Len=0 TSV=218471834 TSER=1385755
>    6.980341 192.168.100.3 -> 192.168.100.173 NFS V3 WRITE Reply (Call In 271) Len:1024 FILE_SYNC

This is all expected. Doing direct i/o with small chunks is always going
to be slow, since the kernel is required to ensure all the data has been
synchronised to disk before the syscall can complete.

You can significantly speed things up by grouping several writes into a
single syscall using the vectored write interface (i.e. writev()). That
allows the kernel to use unstable writes to the server and then issue a
single COMMIT call to sync all the data from the vector at once.
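
For illustration, here is a minimal sketch of that approach (assuming the
file is opened with O_DIRECT on an NFS mount and the buffers are suitably
aligned; the path and sizes are made up, not taken from the test above):

/* Sketch: batch 16 of the 2 KB chunks into one vectored write so the
 * client can send UNSTABLE WRITEs followed by a single COMMIT, instead
 * of one FILE_SYNC round trip per chunk. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define NCHUNKS 16
#define CHUNK   2048

int main(void)
{
	struct iovec iov[NCHUNKS];
	int fd, i;

	fd = open("/mnt/nfs/testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);
	if (fd < 0)
		return 1;

	for (i = 0; i < NCHUNKS; i++) {
		void *buf;

		if (posix_memalign(&buf, 4096, CHUNK))
			return 1;
		memset(buf, 'a', CHUNK);
		iov[i].iov_base = buf;
		iov[i].iov_len = CHUNK;
	}

	/* One 32 KB syscall instead of sixteen separate 2 KB writes. */
	if (writev(fd, iov, NCHUNKS) != NCHUNKS * CHUNK)
		return 1;

	close(fd);
	return 0;
}

The same pattern applies to larger batches; the point is that the data for
the whole vector can be synced with one COMMIT at the end rather than being
forced to stable storage on every RPC.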

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

