* strange performance issues with OS X 10.6 client
@ 2010-04-15 21:49 Stefan Krüger
  2010-04-19 12:21 ` Stefan Krüger
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Krüger @ 2010-04-15 21:49 UTC (permalink / raw)
  To: linux-nfs

Hello list,

I have some really strange NFS performance issues.

NFS server is Fedora 12, running
* kernel-2.6.32.11-99.fc12.x86_64 and
* nfs-utils-1.2.1-4.fc12.x86_64
* nfs shared /home is ext4 with default mount options

/etc/exports:
/home   192.168.1.0/255.255.255.0(rw,sync)

nfs and nfslock are up and running

Nothing else touched on the server nfs-wise.

NFS client is Mac OS X, version 10.6.3

My /home dir is automounted on the Mac with the following mount options:
* nosuid,nodev,resvport,rdirplus,rwsize=1048576
NFSv3 and TCP are the defaults. I have also tried UDP, with and without
rdirplus, and different read/write sizes (I started with 32k, less for UDP,
and then cranked it up to 1m to make the beachball appear less often), but I
still have issues no matter which options I choose.

Anyway, I'm stuck now. Surfing the web with Safari is a very unpleasant
experience on NFS: a beachball every now and then, together with a huge amount
of network traffic (RX with 20MB/s+ peaks). It is not unusual to see several
gigabytes received after a few minutes of browsing. Xcode shows a ''The
document "SomeFile.m" could not be saved.'' error after some edits, Opera
hangs for minutes when closing, etc.

It's horrible :(

Another example: extracting
http://www.bignerdranch.com/solutions/Cocoa-3rd.tgz took almost 6 minutes!

$ time tar xzf Cocoa-3rd.tgz
0.169u 3.198s 5:51.10 0.9%	0+0k 1+6972io 0pf+0w
$ time rm -rf Solutions-Cocoa-3rd/
0.014u 0.477s 0:45.59 1.0%	0+0k 1+1io 0pf+0w

So any help or hints would be really appreciated.

* Re: strange performance issues with OS X 10.6 client
  2010-04-15 21:49 strange performance issues with OS X 10.6 client Stefan Krüger
@ 2010-04-19 12:21 ` Stefan Krüger
  2010-04-19 16:10   ` Stefan Krüger
  2010-04-19 16:59   ` Chuck Lever
  0 siblings, 2 replies; 9+ messages in thread
From: Stefan Krüger @ 2010-04-19 12:21 UTC (permalink / raw)
  To: linux-nfs

So, no answers yet, but I did some more tests: I tried extracting
Cocoa-3rd.tgz (2.2MB, 12MB untarred) onto an export served by FreeBSD 8.0-REL
(running inside VMware, though), and it was still much faster (0:09.35 vs
5:51.10) than extracting onto the bare-metal Fedora 12 server:

$ time tar xfz Cocoa-3rd.tgz 
0.104u 1.474s 0:09.35 16.7%	0+0k 0+4896io 0pf+0w
$ time rm -rf Solutions-Cocoa-3rd
0.006u 0.160s 0:01.24 12.9%	0+0k 0+0io 0pf+0w
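
To put the two runs side by side, the elapsed-time field of the csh-style
`time` output can be converted to seconds and compared. The following is only
a minimal sketch: the helper name `elapsed_seconds` is an assumption made for
illustration, and the two input strings are the elapsed fields from the
timings quoted in this thread.

```python
def elapsed_seconds(field: str) -> float:
    """Convert a csh `time` elapsed field such as '5:51.10' to seconds.

    Handles 'M:SS.ss' as well as 'H:MM:SS.ss' by folding each
    colon-separated part into the running total.
    """
    seconds = 0.0
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed fields copied from the tar timings quoted in this thread.
fedora = elapsed_seconds("5:51.10")   # bare-metal Fedora 12 server
freebsd = elapsed_seconds("0:09.35")  # FreeBSD 8.0 server inside VMware

print(f"Fedora 12: {fedora:.2f}s, FreeBSD 8: {freebsd:.2f}s")
print(f"slowdown: {fedora / freebsd:.1f}x")
```

With these two inputs, the Fedora run comes out at well over thirty times the
FreeBSD run.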

I captured the NFS traffic with tcpdump (tcpdump -i eth1 -s 0 -w nfs.out
host nfssrv and port 2049) on both FreeBSD 8 (the interface name differs on
FreeBSD, of course) and Fedora 12 while running

tar xfz Cocoa-3rd.tgz Solutions-Cocoa-3rd/02_GetStarted

(which extracts just a couple of files). You can find the captures here:

Fedora 12 tcpdump -> http://www.dpaste.org/5cvp/
FreeBSD 8 tcpdump -> http://www.dpaste.org/uCGX/

Honestly, I don't see anything suspicious, but maybe you guys will catch
something.

Anyway, if you need more info or want me to test something else, just drop
me a line.

TIA

* Re: strange performance issues with OS X 10.6 client
  2010-04-19 12:21 ` Stefan Krüger
@ 2010-04-19 16:10   ` Stefan Krüger
  2010-04-19 16:59   ` Chuck Lever
  1 sibling, 0 replies; 9+ messages in thread
From: Stefan Krüger @ 2010-04-19 16:10 UTC (permalink / raw)
  To: linux-nfs

Ok, how about this one..

FreeBSD nfsstat -s (statistics zeroed before running tar) after extracting
Cocoa-3rd.tgz ->

Server Info:
  Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
      628      6419      5871         0       627      2488      1989       627
   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
        0         0         0       227         0         0         0      5529
    Mknod    Fsstat    Fsinfo  PathConf    Commit
        0         2         1         1      2318
Server Ret-Failed
             5871
Server Faults
            0
Server Cache Stats:
   Inprog      Idem  Non-idem    Misses
        0         0         0         0
Server Write Gathering:
 WriteOps  WriteRPC   Opsaved
     2488      2488         0


Fedora 12 nfsstat -s ->

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
33275      0          0          0          0       

Server nfs v3:
null         getattr      setattr      lookup       access       readlink     
0         0% 627       1% 6426     19% 8523     25% 9111     27% 0         0% 
read         write        create       mkdir        symlink      mknod        
994       2% 2362      7% 1996      5% 221       0% 0         0% 0         0% 
remove       rmdir        rename       link         readdir      readdirplus  
660       1% 0         0% 0         0% 0         0% 0         0% 1         0% 
fsstat       fsinfo       pathconf     commit       
1         0% 0         0% 0         0% 2353      7% 

So... 33275 RPC calls for 1589 directories/files?
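
The two nfsstat outputs are easier to compare once the per-procedure counters
are totalled and diffed. The sketch below does that in Python. The counter
values are copied from the two outputs above; the dictionary layout and the
helper name `report` are assumptions made for illustration.

```python
# Per-procedure NFSv3 server counters copied from the two `nfsstat -s`
# outputs above (procedures that are zero on both servers are omitted).
freebsd = {
    "getattr": 628, "setattr": 6419, "lookup": 5871, "read": 627,
    "write": 2488, "create": 1989, "remove": 627, "mkdir": 227,
    "access": 5529, "fsstat": 2, "fsinfo": 1, "pathconf": 1,
    "commit": 2318,
}
fedora = {
    "getattr": 627, "setattr": 6426, "lookup": 8523, "access": 9111,
    "read": 994, "write": 2362, "create": 1996, "mkdir": 221,
    "remove": 660, "readdirplus": 1, "fsstat": 1, "commit": 2353,
}


def report(a, b):
    """Return (procedure, count_a, count_b) rows, largest difference first."""
    ops = sorted(set(a) | set(b))
    rows = [(op, a.get(op, 0), b.get(op, 0)) for op in ops]
    rows.sort(key=lambda row: abs(row[1] - row[2]), reverse=True)
    return rows


fedora_total = sum(fedora.values())
freebsd_total = sum(freebsd.values())
print(f"total calls: fedora={fedora_total} freebsd={freebsd_total}")

# Show the five procedures where the two servers differ the most.
for op, f, b in report(fedora, freebsd)[:5]:
    print(f"{op:<12} fedora={f:>6} freebsd={b:>6} diff={f - b:+d}")
```

The Fedora counters add up to the 33275 calls reported in its rpc stats line,
and the largest differences between the two servers are in ACCESS and LOOKUP.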

* Re: strange performance issues with OS X 10.6 client
  2010-04-19 12:21 ` Stefan Krüger
  2010-04-19 16:10   ` Stefan Krüger
@ 2010-04-19 16:59   ` Chuck Lever
  2010-04-20 21:21     ` Stefan Krüger
  1 sibling, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2010-04-19 16:59 UTC (permalink / raw)
  To: Stefan Krüger; +Cc: linux-nfs

On 04/19/2010 08:21 AM, Stefan Krüger wrote:
> I captured the nfs traffic with tcpdump (tcpdump -i eth1 -s 0 -w nfs.out
> host nfssrv and port 2049) on both freebsd8 (interface for freebsd is a bit
> different ofc) and fedora12 while running
>
> tar xfz Cocoa-3rd.tgz Solutions-Cocoa-3rd/02_GetStarted
>
> (which extracts just a couple of files) , you can find them here:
>
> Fedora 12 tcpdump ->  http://www.dpaste.org/5cvp/
> FreeBSD 8 tcpdump ->  http://www.dpaste.org/uCGX/

The number of packets is around 1800 for the FreeBSD server and around 
1940 for the Linux server.  The RPC counts you posted in a later email 
show that Linux does more LOOKUP and ACCESS requests.  But generally, it 
looks like your client is doing roughly the same amount of work in both 
cases.

But what catches my eye in the F12 tcpdump is that there are pauses 
where the server reply is delayed by a few milliseconds after a SETATTR 
or COMMIT.  This looks normal, since disk writes can take a few 
milliseconds.

FreeBSD doesn't appear to have these pauses, so I suspect FreeBSD is 
doing something illegal.  No NFS server can turn a SETATTR around in 
just a few microseconds and claim that it is on permanent storage, 
unless it has some kind of NVRAM.
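
The pauses described above can be quantified by pairing each RPC reply with
its call by transaction ID (XID) and measuring the gap. What follows is only a
sketch under stated assumptions: it presumes the capture has already been
decoded into simple records, the `Rpc` record layout and both function names
are hypothetical, and the sample records are made up for illustration rather
than taken from the posted captures.

```python
from collections import namedtuple

# One decoded RPC message: capture time in seconds, RPC transaction ID,
# whether it is a call or a reply, and the NFS procedure name.
# This record layout is hypothetical, not the output of any specific tool.
Rpc = namedtuple("Rpc", "time xid kind proc")


def reply_latencies(messages):
    """Pair each reply with its call by XID; return (proc, latency) tuples."""
    pending = {}
    latencies = []
    for msg in messages:
        if msg.kind == "call":
            pending[msg.xid] = msg
        elif msg.xid in pending:
            call = pending.pop(msg.xid)
            latencies.append((call.proc, msg.time - call.time))
    return latencies


def slow_replies(messages, threshold=0.001):
    """Keep only the replies that took at least `threshold` seconds."""
    return [(proc, lat) for proc, lat in reply_latencies(messages)
            if lat >= threshold]


# Made-up sample records for illustration only.
sample = [
    Rpc(0.000000, 1, "call", "LOOKUP"),
    Rpc(0.000150, 1, "reply", "LOOKUP"),
    Rpc(0.000300, 2, "call", "SETATTR"),
    Rpc(0.004800, 2, "reply", "SETATTR"),
    Rpc(0.005000, 3, "call", "COMMIT"),
    Rpc(0.011000, 3, "reply", "COMMIT"),
]

for proc, latency in slow_replies(sample):
    print(f"{proc}: {latency * 1000:.1f} ms")
```

Running the same pairing over both captures would show whether the
millisecond-scale replies cluster on SETATTR and COMMIT for one server only.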

-- 
chuck[dot]lever[at]oracle[dot]com

* Re: strange performance issues with OS X 10.6 client
  2010-04-19 16:59   ` Chuck Lever
@ 2010-04-20 21:21     ` Stefan Krüger
  2010-04-20 21:40       ` Chuck Lever
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Krüger @ 2010-04-20 21:21 UTC (permalink / raw)
  To: linux-nfs; +Cc: Chuck Lever

On Mon, 19 Apr 2010, Chuck Lever wrote:

> The number of packets is around 1800 for the FreeBSD server and around
> 1940 for the Linux server.  The RPC counts you posted in a later email
> show that Linux does more LOOKUP and ACCESS requests.  But generally, it
> looks like your client is doing roughly the same amount of work in both
> cases.
>
> But what catches my eye in the F12 tcpdump is that there are pauses
> where the server reply is delayed by a few milliseconds after a SETATTR
> or COMMIT.  This looks normal, since disk writes can take a few
> milliseconds.
>
> FreeBSD doesn't appear to have these pauses, so I suspect FreeBSD is
> doing something illegal.  No NFS server can turn a SETATTR around in
> just a few microseconds and claim that it is on permanent storage,
> unless it has some kind of NVRAM.

First of all, thanks for your answer Chuck :)

There are some additional packets because the .tgz was on the same
(NFS-mounted) dir during the Linux test and on a local dir during the
FreeBSD test (so some extra reads etc. sneaked into the Linux
tcpdump/nfsstat -s).

Just for the record, Solaris 10u8 (UFS) extraction time is almost the same as
FreeBSD's:

$ time tar xfz Cocoa-3rd.tgz
0.111u 1.621s 0:09.03 19.1%	0+0k 8+4966io 0pf+0w

So I guess they're both cheating ;-)

Anyway, seems like I'm the only one with this problem on OS X (seeing 5min
extraction times and huge rx peaks, beachball etc. during normal use), so
thanks for your time

Cheers

* Re: strange performance issues with OS X 10.6 client
  2010-04-20 21:21     ` Stefan Krüger
@ 2010-04-20 21:40       ` Chuck Lever
  2010-04-20 22:44         ` Stefan Krüger
  0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2010-04-20 21:40 UTC (permalink / raw)
  To: Stefan Krüger; +Cc: linux-nfs

On 04/20/2010 05:21 PM, Stefan Krüger wrote:
> Just for the record, Solaris 10u8 (UFS) extraction time is almost the same as
> FreeBSDs:
>
> $ time tar xfz Cocoa-3rd.tgz
> 0.111u 1.621s 0:09.03 19.1%	0+0k 8+4966io 0pf+0w
>
> So I guess they're both cheating ;-)

Is Solaris running in a guest too?

What physical file system are you using on the Linux NFS server?

-- 
chuck[dot]lever[at]oracle[dot]com

* Re: strange performance issues with OS X 10.6 client
  2010-04-20 21:40       ` Chuck Lever
@ 2010-04-20 22:44         ` Stefan Krüger
  2010-04-21 17:09           ` Chuck Lever
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Krüger @ 2010-04-20 22:44 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Tue, 20 Apr 2010, Chuck Lever wrote:

> Is Solaris running in a guest too?

Yes, this test was also done using VMWare
I also installed Fedora12/ext4 in VMWare and got ->

$ time tar xfz Cocoa-3rd.tgz
0.135u 2.151s 0:36.50 6.2%      0+0k 3+4770io 0pf+0w

Well, 36 seconds: still about 4x longer than Solaris/FreeBSD, but way better
than 5min on bare metal.

Anyway, to get some real data I split my RAID1 mirror on the NFS server
yesterday and installed FreeBSD 8 on the (now) free disk.
I don't know why, but NFS performance is stunning: extracting the tgz only
takes seconds, Opera doesn't hang for minutes when closing, there are no
strange rx spikes, and a lot fewer total bytes are received, all on the same
hardware with the same files (no special tweaks done; the only change on
FreeBSD was to start 8 nfsd servers [default is 4], fs is UFS with softdeps).

I can't explain it...

> What physical file system are you using on the Linux NFS server?

See my first mail ("nfs shared /home is ext4 with default mount options"),
so it's ext4 (I did some tests with xfs/ext3 using fedora12/VMWare, too, same
bad results though)

* Re: strange performance issues with OS X 10.6 client
  2010-04-20 22:44         ` Stefan Krüger
@ 2010-04-21 17:09           ` Chuck Lever
  2010-04-22  1:17             ` Stefan Krüger
  0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2010-04-21 17:09 UTC (permalink / raw)
  To: Stefan Krüger; +Cc: linux-nfs

On Apr 20, 2010, at 6:44 PM, Stefan Krüger wrote:
>> Is Solaris running in a guest too?
> 
> Yes, this test was also done using VMWare
> I also installed Fedora12/ext4 in VMWare and got ->
> 
> $ time tar xfz Cocoa-3rd.tgz
> 0.135u 2.151s 0:36.50 6.2%      0+0k 3+4770io 0pf+0w
> 
> Well, 36 seconds: still about 4x longer than Solaris/FreeBSD, but way better
> than 5min on bare metal.
> 
> Anyway, to get some real data I split my RAID1 mirror on the NFS server
> yesterday and installed FreeBSD 8 on the (now) free disk.
> I don't know why but NFS performance is stunning, extracting the tgz only
> takes seconds, Opera doesn't hang for minutes when closing, no strange rx
> spikes and a lot less total bytes received on the same hardware with the same
> files (no special tweaks done, only change on freebsd was to start 8 nfsd
> servers [default is 4], fs is UFS with softdeps)
> 
> I can't explain it...
> 
>> What physical file system are you using on the Linux NFS server?
> 
> See my first mail ("nfs shared /home is ext4 with default mount options"),
> so it's ext4 (I did some tests with xfs/ext3 using fedora12/VMWare, too, same
> bad results though)


So, VMware is probably not waiting for disk writes to complete, since NFS
servers in VMware guests run faster than native.  Note to self: never put
mission critical data on NFS servers running as VMware guests.

Still, the Linux NFS server has some kind of problem here when comparing
apples to apples.  Does increasing the number of nfsd threads on the server
help?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: strange performance issues with OS X 10.6 client
  2010-04-21 17:09           ` Chuck Lever
@ 2010-04-22  1:17             ` Stefan Krüger
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Krüger @ 2010-04-22  1:17 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Wed, 21 Apr 2010, Chuck Lever wrote:

> So, VMware is probably not waiting for disk writes to complete, since NFS
> servers in VMware guests run faster than native.  Note to self: never put
> mission critical data on NFS servers running as VMware guests.

yeah I only did this for testing...

> Still, the Linux NFS server has some kind of problem here when comparing
> apples to apples.  Does increasing the number of nfsd threads on the server
> help?

Increasing the number of nfsd threads did not help. I raised them from 8 to
32, but Fedora still crawls when extracting the tgz file. (The nfsstat numbers
below were taken with 8 nfsd threads on both Fedora and FreeBSD.)

Speaking of nfsstat, I've got some new statistics. This time I extracted
the whole tgz, not just a single directory (+ files) inside it. The
statistics were zeroed beforehand, and both tests ran on bare metal on the
same hardware, network, etc.

Fedora 12:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
45960      0          0          0          0       

Server nfs v3:
null         getattr      setattr      lookup       access       readlink     
1         0% 641       1% 6885     14% 10373    22% 18336    39% 0         0% 
read         write        create       mkdir        symlink      mknod        
1917      4% 2552      5% 1999      4% 228       0% 0         0% 0         0% 
remove       rmdir        rename       link         readdir      readdirplus  
633       1% 0         0% 3         0% 0         0% 0         0% 3         0% 
fsstat       fsinfo       pathconf     commit       
21        0% 1         0% 1         0% 2366      5% 

FreeBSD 8:

Server Info:
  Getattr   Setattr    Lookup  Readlink      Read     Write    Create Remove
      628      6421      5287         3       703      2490      1991 629
   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus Access
        3         0         0       229         2         0         2 6526
    Mknod    Fsstat    Fsinfo  PathConf    Commit
        0         5         1         1      2402
Server Ret-Failed
             5267
Server Faults
            0
Server Cache Stats:
   Inprog      Idem  Non-idem    Misses
        0         0         0         0
Server Write Gathering:
 WriteOps  WriteRPC   Opsaved
     2490      2490         0

The first thing that catches my eye is Access, which is almost 3x higher on
Fedora (18336 vs. 6526). Next are Read (about 2.7x, 1917 vs. 703) and Lookup
(about 2x, 10373 vs. 5287).
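
To make the comparison above easier to check, here is a small sketch that computes the Fedora/FreeBSD ratio for each operation. The counts are copied from the two server-side nfsstat outputs above; the dictionary layout and the `ratios` helper are just illustrative names, not part of any tool.

```python
# Server-side NFSv3 call counts, copied from the nfsstat outputs above.
fedora = {"getattr": 641, "setattr": 6885, "lookup": 10373, "access": 18336,
          "read": 1917, "write": 2552, "create": 1999, "commit": 2366}
freebsd = {"getattr": 628, "setattr": 6421, "lookup": 5287, "access": 6526,
           "read": 703, "write": 2490, "create": 1991, "commit": 2402}


def ratios(a, b):
    """Return a[op] / b[op] for every operation in a (hypothetical helper)."""
    return {op: a[op] / b[op] for op in a}


r = ratios(fedora, freebsd)

# Print the operations with the largest Fedora/FreeBSD ratio first.
for op, ratio in sorted(r.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{op:8s} {fedora[op]:6d} {freebsd[op]:6d} {ratio:5.2f}x")
```

Access, Read and Lookup stand out, while Getattr, Setattr, Write, Create and Commit are within about 10% of each other on both servers.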

MacOS 10.6 nfsstat client statistics after extracting on Fedora 12:

Client Info:
RPC Counts:
  Getattr   Setattr    Lookup  Readlink      Read     Write    Create Remove
      640      6885     10380         0      1919      2553      1999 634
   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus Access
        3         0         0       228         1         0         3 18353
    Mknod    Fsstat    Fsinfo  PathConf    Commit
        0        21         1         1      2367
RPC Info:
 TimedOut   Invalid X Replies   Retries  Requests
        0         0         0         0     45989
Cache Info:
Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits Misses
    84529      5034      9021      9408      5235      1701       766 2553
BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses
        0         0         1         3         4         0


... after extracting on FreeBSD 8:

Client Info:
RPC Counts:
  Getattr   Setattr    Lookup  Readlink      Read     Write    Create Remove
      628      6421      5290         3       704      2490      1991 630
   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus Access
        3         0         0       229         3         0         2 6538
    Mknod    Fsstat    Fsinfo  PathConf    Commit
        0         5         1         1      2402
RPC Info:
 TimedOut   Invalid X Replies   Retries  Requests
        0         0         0         0     27342
Cache Info:
Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits Misses
    36508      1529      1817      4317       958       704       767 2490
BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses
        1         3         1         7         3         0

Well, this mostly matches what we've seen in the NFS server statistics,
even though the client recorded far more cache hits with the Fedora NFS
server (e.g. 84529 vs. 36508 attribute cache hits)

The nfsstat manpage on OS X explains the cache fields as:
Attr Hits/Misses - Performance of the NFS file attribute cache.
Lkup Hits/Misses - Performance of the directory name lookup cache.
BioR Hits/Misses - Performance of block cache for reads.
BioW Hits/Misses - Performance of block cache for writes.
BioRL Hits/Misses - Performance of symbolic link cache.
BioD Hits/Misses - Performance of directory cache.
DirE Hits/Misses - Performance of directory offset cache.
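
Raw hit counts are hard to compare because the client did far more cache queries against Fedora, so here is a rough sketch that turns the client "Cache Info" numbers above into hit rates. The numbers are copied from the two client nfsstat outputs; `hit_rate` and the dictionary layout are illustrative names only.

```python
def hit_rate(hits, misses):
    """Fraction of cache queries that were hits (0.0 if there were none)."""
    total = hits + misses
    return hits / total if total else 0.0


# (hits, misses) per cache, copied from the Mac OS X client nfsstat output.
client = {
    "Fedora 12": {"Attr": (84529, 5034), "Lkup": (9021, 9408),
                  "BioR": (5235, 1701), "BioW": (766, 2553)},
    "FreeBSD 8": {"Attr": (36508, 1529), "Lkup": (1817, 4317),
                  "BioR": (958, 704), "BioW": (767, 2490)},
}

rates = {
    server: {cache: hit_rate(h, m) for cache, (h, m) in caches.items()}
    for server, caches in client.items()
}

for server, caches in rates.items():
    for cache, rate in caches.items():
        hits, misses = client[server][cache]
        print(f"{server:10s} {cache:5s} queries={hits + misses:6d} hit rate={rate:6.1%}")
```

With these numbers the attribute cache hit rate is about the same in both runs (roughly 94% vs. 96%), but the client made more than twice as many attribute cache queries and about three times as many name lookup queries during the Fedora run.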

I've also captured the traffic from both sessions with tcpdump. The captures
are quite large, though (42MB for Fedora and 10MB for FreeBSD), so let me
know if you want to take a look

TIA

PS: Let me stress again that extraction using the FreeBSD NFS server takes 10
*seconds*, while on Fedora it takes over 5 *minutes*...
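
A back-of-the-envelope sketch of what that means per RPC, using the client "Requests" totals from the nfsstat output above. It assumes the round figures of 10 s and 300 s (the latter is only a lower bound for "over 5 minutes") and that the wall-clock time is dominated by waiting for RPC replies, which is an assumption, not something measured here.

```python
# Client-side RPC request totals from the Mac OS X nfsstat output above.
fedora_requests = 45989
freebsd_requests = 27342

# Approximate wall-clock extraction times in seconds (round figures;
# 300 s is only a lower bound for "over 5 minutes").
fedora_seconds = 300.0
freebsd_seconds = 10.0

# Average wall-clock time per RPC in milliseconds.
fedora_ms = fedora_seconds / fedora_requests * 1000
freebsd_ms = freebsd_seconds / freebsd_requests * 1000

count_ratio = fedora_requests / freebsd_requests
time_ratio = fedora_seconds / freebsd_seconds

print(f"Fedora : {fedora_ms:.2f} ms per RPC (at least)")
print(f"FreeBSD: {freebsd_ms:.3f} ms per RPC")
print(f"RPC count ratio: {count_ratio:.2f}x, wall-clock ratio: {time_ratio:.0f}x")
```

Under these assumptions the extra RPCs explain only a small part of the difference: Fedora sees about 1.7x as many requests but takes at least 30x as long, so each request takes much longer on average.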

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-04-22  1:17 UTC | newest]

Thread overview: 9+ messages
2010-04-15 21:49 strange performance issues with OS X 10.6 client Stefan Krüger
2010-04-19 12:21 ` Stefan Krüger
2010-04-19 16:10   ` Stefan Krüger
2010-04-19 16:59   ` Chuck Lever
2010-04-20 21:21     ` Stefan Krüger
2010-04-20 21:40       ` Chuck Lever
2010-04-20 22:44         ` Stefan Krüger
2010-04-21 17:09           ` Chuck Lever
2010-04-22  1:17             ` Stefan Krüger
