linux-kernel.vger.kernel.org archive mirror
* 2.4.21, NFS v3, and 3com 920
@ 2003-07-22  5:42 Matthew Hunter
  2003-07-22 13:19 ` Valdis.Kletnieks
  0 siblings, 1 reply; 4+ messages in thread
From: Matthew Hunter @ 2003-07-22  5:42 UTC
  To: linux-kernel

Short version: 
Any known gotchas for 2.4.21 NFS v3 and/or a 3com 920?

Long version:
I'm trying to run the /home directory for a small network from a 
single fileserver via NFS.  The server is a 4-way PPro 200 on 
SCSI RAID5 using linux v2.4.21, and the client is a 2-way Athlon
(on a Tyan 2466, MPX chipset).  The network is just 5 machines,
mostly idle, Fast Ethernet on a switch.

Everything was working wonderfully until I rebooted the client 
server, and ran into the gnome-over-NFS-has-gconf-problems 
problem.  Researched, and the fix said "Use NFSv3".  So I 
recompiled, and used NFSv3.

Except this didn't work.  At all.  

I could mount the directory as before, but any attempt to read 
from it takes forever and involves a large number of timeouts.  
By "forever" I mean hours just to run mutt and open up a mailbox.
It might eventually succeed.  Maybe.

No load is apparent on either server while this is happening.  
Occasionally, I'll see the client report that the server timed 
out.  If I'm very patient, I'll eventually see a report that the 
server is "OK".

After much futzing around trying to isolate the problem, it turns
out I can trigger it by compiling with NFSv3 on (from the client 
machine; server always has v3).  NFS without v3 works, except for 
the gnome thing.  And NFS with v3 works if you use TCP, but it's 
slow.

So I'm thinking, maybe there's some kind of packet loss problem?

ifconfig doesn't show any errors or dropped packets.  It does 
show some overruns when running under NFS v3 TCP, but did not 
when running without TCP.
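
The counters ifconfig prints are also readable from /proc/net/dev, 
which makes before/after comparisons easier to script.  A minimal 
Python sketch, assuming the usual layout of two header lines followed 
by eight receive and eight transmit fields per interface.  The 
function names and the sample numbers are illustrative only, not 
output from the machines in this thread, and as far as I know the 
"fifo" column is what ifconfig reports as overruns.

```python
RX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed"]


def parse_net_dev(text):
    """Return {iface: {"rx": {...}, "tx": {...}}} from /proc/net/dev-style text."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        values = [int(v) for v in rest.split()]
        if len(values) < 16:
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:16])),
        }
    return stats


def rx_trouble(stats, iface):
    """Return only the non-zero receive error counters for one interface."""
    rx = stats[iface]["rx"]
    return {k: rx[k] for k in ("errs", "drop", "fifo", "frame") if rx[k]}


# Made-up sample data in the /proc/net/dev layout (hypothetical numbers).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    0    0   37     0          0         0  9000000    7000    0    0    0     0       0          0
"""

stats = parse_net_dev(SAMPLE)
print(rx_trouble(stats, "eth0"))
```

On a live system the same function can be fed 
open("/proc/net/dev").read() before and after a test transfer.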

I decide to run some large file transfers via SCP outside of the 
NFS mount to see what those show.  Lo and behold, the fancy new 
system can only receive at very low speeds -- a minimum of about 
40 KB/s, max of maybe 200, mostly 60-80 KB/s.  Weird -- but that 
shouldn't make NFS time out horribly by itself.  

Running more tests, it turns out the speed problem is isolated to 
the one machine, and only to *receiving* data.  Sending goes at 
8 M/s to other machines from the client machine.  Sending from 
any machine to the client machine is slowed down, not just from 
the server.

I can drop another ethernet card into the client machine and try 
that, and in fact, that's probably the next thing I'll do when I 
work on it further tomorrow evening.  But for now, does any of 
this sound familiar to anyone?  Is this particular 3com chipset 
known to be broken, or just not supported well?  Any possible 
software cause for the slowdown?

Motherboard information is here:
http://www.tyan.com/products/html/tigermpx.html

-- 
Matthew Hunter (matthew@infodancer.org)
Public Key: http://matthew.infodancer.org/public_key.txt
Homepage: http://matthew.infodancer.org/index.jsp
Politics: http://www.triggerfinger.org/index.jsp

* Re: 2.4.21, NFS v3, and 3com 920
  2003-07-22  5:42 2.4.21, NFS v3, and 3com 920 Matthew Hunter
@ 2003-07-22 13:19 ` Valdis.Kletnieks
  2003-07-22 18:06   ` Samuel Flory
  0 siblings, 1 reply; 4+ messages in thread
From: Valdis.Kletnieks @ 2003-07-22 13:19 UTC
  To: Matthew Hunter; +Cc: linux-kernel

On Tue, 22 Jul 2003 00:42:45 CDT, Matthew Hunter <matthew@infodancer.org>  said:

> Running more tests, it turns out the speed problem is isolated to 
> the one machine, and only to *receiving* data.  Sending goes at 
> 8 M/s to other machines from the client machine.  Sending from 
> any machine to the client machine is slowed down, not just from 
> the server.

These symptoms sound suspiciously like a 100BaseT auto-negotiation
problem.  With some combinations of gear, if one end is set to auto-negotiate
and the other end is nailed to full/half duplex (sorry, can't remember which and
I've not had my caffeine yet), things go horribly wrong and many packets
disappear silently on transmission, forcing retransmit timeouts and bad
throughput.  Basically, you end up with one end thinking it's full duplex,
the other end at half - and if the full duplex side ever sends a packet while
the half side is sending, the packet's lost.

Try nailing the devices on both ends of the cat-5 to the same thing (full or
half).  This can of course be interesting if you have an unmanaged hub that
doesn't give you a choice...
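
For reference, the usual failing combination is one end forced to 
full duplex while the other autonegotiates: the autonegotiating end 
gets no negotiation back, senses the speed on its own, and falls 
back to half duplex.  A minimal Python sketch of that rule follows.  
The Port type and the function names are made up for illustration, 
and the model covers nothing beyond this one behaviour.

```python
from dataclasses import dataclass


@dataclass
class Port:
    autoneg: bool          # True if the port autonegotiates
    duplex: str = "full"   # forced setting, or best capability when autonegotiating


def resolved_duplex(local, peer):
    """Duplex that `local` ends up using on a link to `peer`."""
    if not local.autoneg:
        return local.duplex  # forced: uses exactly what it was told
    if not peer.autoneg:
        return "half"        # no negotiation partner: speed is sensed, duplex falls back to half
    # Both sides negotiate: full duplex only if both are capable of it.
    return "full" if local.duplex == "full" and peer.duplex == "full" else "half"


def mismatch(a, b):
    """True if the two ends of the link disagree about duplex."""
    return resolved_duplex(a, b) != resolved_duplex(b, a)


print(mismatch(Port(autoneg=True), Port(autoneg=False, duplex="full")))
print(mismatch(Port(autoneg=True), Port(autoneg=False, duplex="half")))
print(mismatch(Port(autoneg=True), Port(autoneg=True)))
```

In this model only the forced-full case disagrees, which is why 
setting both ends the same way (both forced, or both auto) is the 
usual advice.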





* Re: 2.4.21, NFS v3, and 3com 920
  2003-07-22 13:19 ` Valdis.Kletnieks
@ 2003-07-22 18:06   ` Samuel Flory
  2003-07-22 18:58     ` Matthew Hunter
  0 siblings, 1 reply; 4+ messages in thread
From: Samuel Flory @ 2003-07-22 18:06 UTC
  To: Matthew Hunter; +Cc: Valdis.Kletnieks, linux-kernel

Valdis.Kletnieks@vt.edu wrote:

>On Tue, 22 Jul 2003 00:42:45 CDT, Matthew Hunter <matthew@infodancer.org>  said:
>
>>Running more tests, it turns out the speed problem is isolated to 
>>the one machine, and only to *receiving* data.  Sending goes at 
>>8 M/s to other machines from the client machine.  Sending from 
>>any machine to the client machine is slowed down, not just from 
>>the server.
>
>These symptoms sound suspiciously like a 100BaseT auto-negotiation
>problem.  With some combinations of gear, if one end is set to auto-negotiate
>and the other end is nailed to full/half duplex (sorry, can't remember which and
>I've not had my caffeine yet), things go horribly wrong and many packets
>disappear silently on transmission, forcing retransmit timeouts and bad
>throughput.  Basically, you end up with one end thinking it's full duplex,
>the other end at half - and if the full duplex side ever sends a packet while
>the half side is sending, the packet's lost.
>
>Try nailing the devices on both ends of the cat-5 to the same thing (full or
>half).  This can of course be interesting if you have an unmanaged hub that
>doesn't give you a choice...

  You should be able to use mii-tool, or ethtool (one or both should 
work) to check the state your ethernet controller thinks it is set to, 
and change the settings.

[root@sflory cujo]# mii-tool -v eth0
eth0: negotiated 100baseTx-FD, link ok
  product info: vendor 00:aa:00, model 51 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
[root@sflory cujo]# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: puag
        Wake-on: g
        Link detected: yes
[root@sflory cujo]#
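
If you want to check several machines, output in the format above 
is easy to pick apart.  A minimal Python sketch, assuming the 
`mii-tool -v` layout shown here; the function names are my own, 
and it only checks that the negotiated mode appears in both the 
local "advertising" list and the "link partner" list.

```python
def parse_mii_tool(text):
    """Parse `mii-tool -v` output for one interface into a dict."""
    lines = [line.strip() for line in text.strip().splitlines()]
    iface, _, summary = lines[0].partition(":")
    info = {"iface": iface, "summary": summary.strip()}
    for line in lines[1:]:
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.split()
    return info


def negotiated_mode_plausible(info):
    """True if the negotiated mode is advertised locally and offered by the partner."""
    words = info["summary"].split()
    if not words or words[0] != "negotiated":
        return False  # forced mode, so nothing was negotiated
    mode = words[1].rstrip(",")
    return (mode in info.get("advertising", [])
            and mode in info.get("link partner", []))


# The mii-tool output from this message, pasted in as test input.
SAMPLE = """\
eth0: negotiated 100baseTx-FD, link ok
  product info: vendor 00:aa:00, model 51 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
"""

info = parse_mii_tool(SAMPLE)
print(info["iface"], negotiated_mode_plausible(info))
```

Note that this only reflects what the local controller believes; 
a device at the other end that reports one thing and does another 
will still look fine here.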

-- 
Once you have their hardware. Never give it back.
(The First Rule of Hardware Acquisition)
Sam Flory  <sflory@rackable.com>




* Re: 2.4.21, NFS v3, and 3com 920
  2003-07-22 18:06   ` Samuel Flory
@ 2003-07-22 18:58     ` Matthew Hunter
  0 siblings, 0 replies; 4+ messages in thread
From: Matthew Hunter @ 2003-07-22 18:58 UTC
  To: linux-kernel

On Tue, Jul 22, 2003 at 11:06:59AM -0700, Samuel Flory <sflory@rackable.com> wrote:
> >Try nailing the devices on both ends of the cat-5 to the same thing (full 
> >or half).  This can of course be interesting if you have an 
> >unmanaged hub that doesn't give you a choice...
>  You should be able to use mii-tool, or ethtool (one or both should 
> work) to check the state your ethernet controller thinks it is set to, 
> and change the settings.

So far I've seen several people point to this, and I just now had 
the chance to test the advice.  Here are the results:

image:~# mii-tool -v eth0
eth0: negotiated 100baseTx-FD, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

That's the default.  OK, the hub thinks it's FD, the adapter 
thinks it's FD.  Should be a match.  

Test with a large file transfer: 80 KB/s, about as expected (ie, 
the problem still exists).

Let's assume the hub is smoking something interesting and 
force HD.  (The hub is unmanaged, so I can't force it to do 
anything).

image:~# mii-tool --force=100baseTx-HD eth0         
image:~# mii-tool -v eth0
eth0: 100 Mbit, half duplex, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode:   100 Mbit, half duplex
  basic status: link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

OK, adapter forced to half duplex.

Test with a large file transfer -- no change, still about 80 
KB/s.

Let's try to autonegotiate for the same result...

image:~# mii-tool --reset eth0
resetting the transceiver...
image:~# mii-tool --advertise=100baseTx-HD eth0 
restarting autonegotiation...
image:~# mii-tool -v eth0
eth0: negotiated 100baseTx-HD, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

OK, looks fine.  Test... no change.

I predict hardware swaps in my future when I get home.

Just for giggles, I'll try 10baseT.

image:~# mii-tool --reset eth0
resetting the transceiver...
image:~# mii-tool --advertise=10baseT-FD eth0 
restarting autonegotiation...
image:~# mii-tool -v eth0
eth0: negotiated 10baseT-FD, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  10baseT-FD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

Lo and behold, 1.1 MB/s!  

Note that this is supposedly a fast ethernet hub and a fast 
ethernet adapter.  The other hosts on the hub all think so.
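
To put the numbers side by side: about 80 KB/s on a 100 Mbit link 
is well under 1% of the line rate, while 1.1 MB/s on a 10 Mbit 
link is close to 90% of it.  A small Python sketch of the 
arithmetic, assuming decimal units (1 KB = 1000 bytes) and 
ignoring protocol overhead; the function name is just for 
illustration.

```python
def utilization(observed_bytes_per_s, link_mbit):
    """Fraction of the raw line rate that the observed transfer rate represents."""
    line_rate_bytes = link_mbit * 1_000_000 / 8  # bits per second to bytes per second
    return observed_bytes_per_s / line_rate_bytes


slow_100 = utilization(80 * 1000, 100)       # about 80 KB/s at 100baseTx
good_10 = utilization(1.1 * 1_000_000, 10)   # about 1.1 MB/s at 10baseT-FD

print(f"100 Mbit link: {slow_100:.2%} of line rate")
print(f" 10 Mbit link: {good_10:.2%} of line rate")
```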

I wonder if I'm plugged into a special port or something.  
I'll play with that when I'm near the hardware later on tonight.

Thanks for your help, all of you.  I think I have the answers 
that I wanted -- namely, it's probably not a kernel problem.  

I am unsure if this explains the NFS problem (ie, NFS breaks with 
v3 enabled), but since it works via tcp, I'm not of any mind to 
complain.  If anyone is interested, I can try without tcp but 
with the ethernet controller in better shape and see if I can 
still cause the same symptoms.

-- 
Matthew Hunter (matthew@infodancer.org)
Public Key: http://matthew.infodancer.org/public_key.txt
Homepage: http://matthew.infodancer.org/index.jsp
Politics: http://www.triggerfinger.org/index.jsp
