linux-kernel.vger.kernel.org archive mirror
* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
       [not found] <20040816110000.1120.31256.Mailman@lists.us.dell.com>
@ 2004-08-16 11:51 ` Tetsuo Handa
  2004-08-16 21:38   ` David S. Miller
  0 siblings, 1 reply; 34+ messages in thread
From: Tetsuo Handa @ 2004-08-16 11:51 UTC (permalink / raw)
  To: davem, linux-kernel

Hello, David.

On Sun, 15 Aug 2004 14:59:52 -0700
David S. Miller wrote:
> On Sun, 15 Aug 2004 16:57:58 -0300
> Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
> 
> > On Sun, Aug 15, 2004 at 01:53:49AM +0900, Tetsuo Handa wrote:
> > > 
> > > I'm using tg3.o with DHCP and PXE boot environment
> > > and I updated from 2.4.26 to 2.4.27,
> > > but tg3.o stopped working on IBM BladeCenter.
> > 
> > David Miller is the tg3 maintainer, he will help you.
> 
> Does manual IP configuration work?

'ifconfig eth0 192.168.0.40' and 'route add default gw 192.168.0.1' showed
no error messages, but 'ping' gets no reply.

Everything from 2.4.26 through 2.4.27-rc3 was OK.
This trouble happens with 2.4.27-rc4 and later.

--
I'm sorry, but since yesterday I have temporarily disabled a5497108@anet.ne.jp to reduce spam.
Please reply to dev_null@anet.ne.jp if you cannot reach a5497108@anet.ne.jp .
Thank you.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-16 11:51 ` TG3 doesn't work in kernel 2.4.27 (David S. Miller) Tetsuo Handa
@ 2004-08-16 21:38   ` David S. Miller
  2004-08-25 17:48     ` Mike Waychison
  0 siblings, 1 reply; 34+ messages in thread
From: David S. Miller @ 2004-08-16 21:38 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: linux-kernel

On Mon, 16 Aug 2004 20:51:03 +0900
Tetsuo Handa <a5497108@anet.ne.jp> wrote:

> Everything from 2.4.26 through 2.4.27-rc3 was OK.
> This trouble happens with 2.4.27-rc4 and later.

It's Sun's buggy 5704 Fiber auto-negotiation changes.

Here is a hacky possible fix, can you try it?

===== drivers/net/tg3.c 1.190 vs edited =====
--- 1.190/drivers/net/tg3.c	2004-07-21 14:14:20 -07:00
+++ edited/drivers/net/tg3.c	2004-08-16 14:24:53 -07:00
@@ -5266,6 +5266,8 @@
 	tw32_f(MAC_LOW_WMARK_MAX_RX_FRAME, 2);
 
 	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704 &&
+	    !(tp->pci_chip_rev_id == CHIPREV_ID_5704_A0 ||
+	      tp->pci_chip_rev_id == CHIPREV_ID_5704_A1) &&
 	    tp->phy_id == PHY_ID_SERDES) {
 		/* Enable hardware link auto-negotiation */
 		u32 digctrl, txctrl;


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-16 21:38   ` David S. Miller
@ 2004-08-25 17:48     ` Mike Waychison
  2004-08-25 19:08       ` David S. Miller
  0 siblings, 1 reply; 34+ messages in thread
From: Mike Waychison @ 2004-08-25 17:48 UTC (permalink / raw)
  To: David S. Miller; +Cc: Tetsuo Handa, linux-kernel


David S. Miller wrote:
> On Mon, 16 Aug 2004 20:51:03 +0900
> Tetsuo Handa <a5497108@anet.ne.jp> wrote:
>
>
>> Everything from 2.4.26 through 2.4.27-rc3 was OK.
>> This trouble happens with 2.4.27-rc4 and later.
>
>
> It's Sun's buggy 5704 Fiber auto-negotiation changes.
>
> Here is a hacky possible fix, can you try it?

Tetsuo posted his lspci -vv output and he has an A2.  The hardware
autoneg patch was written and tested against an A3.

Would it make sense to do (hand-edited):



===== drivers/net/tg3.c 1.190 vs edited =====
--- 1.190/drivers/net/tg3.c	2004-07-21 14:14:20 -07:00
+++ edited/drivers/net/tg3.c	2004-08-16 14:24:53 -07:00
@@ -5266,6 +5266,7 @@
 	tw32_f(MAC_LOW_WMARK_MAX_RX_FRAME, 2);

 	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704 &&
+	    tp->pci_chip_rev_id == CHIPREV_ID_5704_A3 &&
 	    tp->phy_id == PHY_ID_SERDES) {
 		/* Enable hardware link auto-negotiation */
 		u32 digctrl, txctrl;


-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-25 17:48     ` Mike Waychison
@ 2004-08-25 19:08       ` David S. Miller
  2004-08-25 20:04         ` Mike Waychison
  0 siblings, 1 reply; 34+ messages in thread
From: David S. Miller @ 2004-08-25 19:08 UTC (permalink / raw)
  To: Mike Waychison; +Cc: a5497108, linux-kernel

On Wed, 25 Aug 2004 13:48:49 -0400
Mike Waychison <Michael.Waychison@Sun.COM> wrote:

> Tetsuo posted his lspci -vv output and he has an A2.  The hardware
> autoneg patch was written and tested against an A3.
> 
> Would it make sense to do (hand-edited):

Not really.  The autoneg code in the bcm5700 driver works on
all revisions of the 5704 chipset.

If I can't get this working soon, I'm disabling it for all boards.
The software based fibre autoneg should work just fine for
everyone.


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-25 19:08       ` David S. Miller
@ 2004-08-25 20:04         ` Mike Waychison
  2004-08-26  0:58           ` David S. Miller
  0 siblings, 1 reply; 34+ messages in thread
From: Mike Waychison @ 2004-08-25 20:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, Brian Somers


(removed Tetsuo from the Cc list, as it appears the email address is stale)

David S. Miller wrote:
> On Wed, 25 Aug 2004 13:48:49 -0400
> Mike Waychison <Michael.Waychison@Sun.COM> wrote:
>
>
>>Tetsuo posted his lspci -vv output and he has an A2.  The hardware
>>autoneg patch was written and tested against an A3.
>>
>>Would it make sense to do (hand-edited):
>
>
> Not really.  The autoneg code in the bcm5700 driver works on
> all revisions of the 5704 chipset.
>
> If I can't get this working soon, I'm disabling it for all boards.
> The software based fibre autoneg should work just fine for
> everyone.

If I understand it correctly, the problem we were seeing is that the
chip was getting framing errors in high-traffic scenarios.  Setting it
to use hardware autoneg made these errors disappear.  It's possible we
need some other work-around.. :\

Maybe Brian can better explain the issue at hand.

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-25 20:04         ` Mike Waychison
@ 2004-08-26  0:58           ` David S. Miller
  2004-08-26 10:49             ` Brian Somers
  0 siblings, 1 reply; 34+ messages in thread
From: David S. Miller @ 2004-08-26  0:58 UTC (permalink / raw)
  To: Mike Waychison; +Cc: linux-kernel, Brian.Somers

On Wed, 25 Aug 2004 16:04:57 -0400
Mike Waychison <Michael.Waychison@Sun.COM> wrote:

> If I understand it correctly, the problem we were seeing is that the
> chip was getting framing errors in high-traffic scenarios.  Setting it
> to use hardware autoneg made these errors disappear.  It's possible we
> need some other work-around.. :\

So what rev 5704 chips were in Sun's Opteron boxes where you
saw the problem?  A0/A1 chips?


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-26  0:58           ` David S. Miller
@ 2004-08-26 10:49             ` Brian Somers
  2004-08-26 19:37               ` David S. Miller
  2004-08-30 23:11               ` David S. Miller
  0 siblings, 2 replies; 34+ messages in thread
From: Brian Somers @ 2004-08-26 10:49 UTC (permalink / raw)
  To: David S. Miller; +Cc: Mike Waychison, linux-kernel

Hi,

David S. Miller wrote:
> On Wed, 25 Aug 2004 16:04:57 -0400
> Mike Waychison <Michael.Waychison@Sun.COM> wrote:
> 
> 
>>If I understand it correctly, the problem we were seeing is that the
>>chip was getting framing errors in high-traffic scenarios.  Setting it
>>to use hardware autoneg made these errors disappear.  It's possible we
>>need some other work-around.. :\
> 
> 
> So what rev 5704 chips were in Sun's Opteron boxes where you
> saw the problem?  A0/A1 chips?

First, forgive the lack of specifics here, I haven't got access to any
of the hardware in question right now...

The issue was actually seen in Sun's x86 blades - the B200x boxes.  They
were using A3 parts (2 of them per box == 4 interfaces), although a
comment was added with one of the last modifications (unrelated to the
autoneg stuff) that said it was tested on an A2 part, so I guess there
were a number of them about too.  The machine had a PCI-X bus running at
64bits and either 66 or 133MHz (I think it was 133MHz, but I may be
wrong!).

The issue was that the frame error count was being bumped and we were
losing traffic under load.  From what I can remember it averaged
something like one in 30000 packets being dropped, but it may have
been slightly less frequent than that.

After the hardware guys here had ruled out any cross-talk possibilities,
I talked to Broadcom about it and they suggested that we enable
hardware autoneg.  The reasoning was that when hw autoneg is enabled,
the chip has a completely different code path for incoming traffic
where it doesn't have to look inside every packet to see if it's a
negotiation frame.  This increased throughput enough to defeat the
framing errors we were seeing.

The changes I made were reviewed by Broadcom and they seemed happy
that hw autoneg was enabled for all 5704 silicon revisions...


It's a bit strange that it stopped working only with the latest 2.4
version of tg3 - I wonder if the driver's actually coming back with
``HW autoneg failed'' in this scenario?  Perhaps there's a compat
issue between the switch and the Broadcom hw autoneg engine?

Can we get this guy to try running an older version of tg3 to see
what change introduced the issue?

Another interesting point was that the guy who wrote the bcm driver
for Solaris had problems enabling hw autoneg.  AFAIR he said that
when he enabled it, MAC_STATUS_PCS_SYNCED never turned up again.  I
don't know if this issue was ever resolved, and he no longer works
for Sun.  I never saw this issue under Linux.

-- 
Brian Somers                                            Sun Microsystems
                                             Sparc House, Guillemont Park
Software Engineer - LSE                          Minley Road, Blackwater
Tel: +44 1252 421 263   Ext: 21263                    Camberley GU17 9QG



* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-26 10:49             ` Brian Somers
@ 2004-08-26 19:37               ` David S. Miller
  2004-08-29  9:56                 ` Pekka Pietikainen
  2004-09-10 12:35                 ` Brian Somers
  2004-08-30 23:11               ` David S. Miller
  1 sibling, 2 replies; 34+ messages in thread
From: David S. Miller @ 2004-08-26 19:37 UTC (permalink / raw)
  To: Brian Somers; +Cc: Michael.Waychison, linux-kernel

On Thu, 26 Aug 2004 11:49:57 +0100
Brian Somers <brian.somers@sun.com> wrote:

> Can we get this guy to try running an older version of tg3 to see
> what change introduced the issue?

Brian, we already narrowed it down to exactly the hw autoneg
changes Sun wrote.  It breaks the IBM blades onboard 5704
fibre chips.  Reverting your change or disabling hw autoneg
in the new code both fix the problem.


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-26 19:37               ` David S. Miller
@ 2004-08-29  9:56                 ` Pekka Pietikainen
  2004-09-10 12:35                 ` Brian Somers
  1 sibling, 0 replies; 34+ messages in thread
From: Pekka Pietikainen @ 2004-08-29  9:56 UTC (permalink / raw)
  To: David S. Miller; +Cc: Brian Somers, Michael.Waychison, linux-kernel

On Thu, Aug 26, 2004 at 12:37:30PM -0700, David S. Miller wrote:
> On Thu, 26 Aug 2004 11:49:57 +0100
> Brian Somers <brian.somers@sun.com> wrote:
> 
> > Can we get this guy to try running an older version of tg3 to see
> > what change introduced the issue?
> 
> Brian, we already narrowed it down to exactly the hw autoneg
> changes Sun wrote.  It breaks the IBM blades onboard 5704
> fibre chips.  Reverting your change or disabling hw autoneg
> in the new code both fix the problem.
Just another datapoint, an IBM blade with

01:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704S Gigabit
Ethernet (rev 02)

01:00.1 Class 0200: 14e4:16a8 (rev 02)
        Subsystem: 1014:029c
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (16000ns min), Cache Line Size 08
        Interrupt: pin B routed to IRQ 185
        Region 0: Memory at fbfd0000 (64-bit, non-prefetchable)
        Region 2: Memory at fbfc0000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [40] PCI-X non-bridge device.
                Command: DPERE- ERO- RBC=2 OST=0
                Status: Bus=1 Dev=0 Func=1 64bit+ 133MHz+ SCD- USC-,
DC=simple, DMMRBC=2, DMOST=0, DMCRS=1, RSCEM-
        Capabilities: [48] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable+ DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/3
Enable-
                Address: 0000000100000000  Data: 5900

doesn't work with the hw autoneg stuff in fc2's 2.6.8-1.521, #if 0
around the

        if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5704 &&
            tp->phy_id == PHY_ID_SERDES) {
                /* Enable hardware link auto-negotiation */
		...
	} 

makes it work. So it looks like an A2 vs. A3 (or
PCI_SUBSYSTEM_VENDOR_IBM ;) ) thing.

Btw., an ethtool workaround would be appreciated, or is that even possible? I
tried ethtool -s eth1 speed 1000 duplex full port fibre autoneg off without
luck. But that was over a java-based VNC thing run remotely over SSH port
forwarding that gets keyboard mappings wrong, so I didn't spend too much
time playing around :-)

-- 
Pekka Pietikainen


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-26 10:49             ` Brian Somers
  2004-08-26 19:37               ` David S. Miller
@ 2004-08-30 23:11               ` David S. Miller
  2004-09-03 19:12                 ` Paul Larson
  1 sibling, 1 reply; 34+ messages in thread
From: David S. Miller @ 2004-08-30 23:11 UTC (permalink / raw)
  To: Brian Somers; +Cc: Michael.Waychison, linux-kernel


Michael Chan at Broadcom spotted the bug.

Things are totally broken if the switch/hub does not support
autonegotiation.  Checking for the MAC_STATUS_SIGNAL_DET bit
in the tg3 polling timer fixes the problem.

This is probably why it worked for you and doesn't with the
IBM blades as blades are more likely to be connected to
non-autoneg'ing devices.

===== drivers/net/tg3.c 1.199 vs edited =====
--- 1.199/drivers/net/tg3.c	2004-08-18 19:52:35 -07:00
+++ edited/drivers/net/tg3.c	2004-08-30 15:08:07 -07:00
@@ -5602,7 +5602,8 @@
 				need_setup = 1;
 			}
 			if (! netif_carrier_ok(tp->dev) &&
-			    (mac_stat & MAC_STATUS_PCS_SYNCED)) {
+			    (mac_stat & (MAC_STATUS_PCS_SYNCED |
+					 MAC_STATUS_SIGNAL_DET))) {
 				need_setup = 1;
 			}
 			if (need_setup) {


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-30 23:11               ` David S. Miller
@ 2004-09-03 19:12                 ` Paul Larson
  2004-09-03 19:19                   ` Mike Waychison
  2004-09-03 20:08                   ` David S. Miller
  0 siblings, 2 replies; 34+ messages in thread
From: Paul Larson @ 2004-09-03 19:12 UTC (permalink / raw)
  To: David S. Miller; +Cc: Brian Somers, Michael.Waychison, lkml


I tried this patch alone on top of 2.6.9-rc1 and tg3 is still broken for
me on JS20 blades.  Was there another patch I should have applied in
conjunction with this?

Thanks,
Paul Larson

On Mon, 2004-08-30 at 18:11, David S. Miller wrote:
> Michael Chan at Broadcom spotted the bug.
> 
> Things are totally broken if the switch/hub does not support
> autonegotiation.  Checking for the MAC_STATUS_SIGNAL_DET bit
> in the tg3 polling timer fixes the problem.
> 
> This is probably why it worked for you and doesn't with the
> IBM blades as blades are more likely to be connected to
> non-autoneg'ing devices.
> 
> ===== drivers/net/tg3.c 1.199 vs edited =====
> --- 1.199/drivers/net/tg3.c	2004-08-18 19:52:35 -07:00
> +++ edited/drivers/net/tg3.c	2004-08-30 15:08:07 -07:00
> @@ -5602,7 +5602,8 @@
>  				need_setup = 1;
>  			}
>  			if (! netif_carrier_ok(tp->dev) &&
> -			    (mac_stat & MAC_STATUS_PCS_SYNCED)) {
> +			    (mac_stat & (MAC_STATUS_PCS_SYNCED |
> +					 MAC_STATUS_SIGNAL_DET))) {
>  				need_setup = 1;
>  			}
>  			if (need_setup) {

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-03 19:12                 ` Paul Larson
@ 2004-09-03 19:19                   ` Mike Waychison
  2004-09-03 20:18                     ` Roland Dreier
  2004-09-03 20:08                   ` David S. Miller
  1 sibling, 1 reply; 34+ messages in thread
From: Mike Waychison @ 2004-09-03 19:19 UTC (permalink / raw)
  To: Paul Larson; +Cc: David S. Miller, Brian Somers, lkml


Paul Larson wrote:
> I tried this patch alone on top of 2.6.9-rc1 and tg3 is still broken for
> me on JS20 blades.  Was there another patch I should have applied in
> conjunction with this?
>

Is this with or without autonegotiation enabled on the switch?

> Thanks,
> Paul Larson
>
> On Mon, 2004-08-30 at 18:11, David S. Miller wrote:
>
>>Michael Chan at Broadcom spotted the bug.
>>
>>Things are totally broken if the switch/hub does not support
>>autonegotiation.  Checking for the MAC_STATUS_SIGNAL_DET bit
>>in the tg3 polling timer fixes the problem.
>>
>>This is probably why it worked for you and doesn't with the
>>IBM blades as blades are more likely to be connected to
>>non-autoneg'ing devices.
>>
>>===== drivers/net/tg3.c 1.199 vs edited =====
>>--- 1.199/drivers/net/tg3.c	2004-08-18 19:52:35 -07:00
>>+++ edited/drivers/net/tg3.c	2004-08-30 15:08:07 -07:00
>>@@ -5602,7 +5602,8 @@
>> 				need_setup = 1;
>> 			}
>> 			if (! netif_carrier_ok(tp->dev) &&
>>-			    (mac_stat & MAC_STATUS_PCS_SYNCED)) {
>>+			    (mac_stat & (MAC_STATUS_PCS_SYNCED |
>>+					 MAC_STATUS_SIGNAL_DET))) {
>> 				need_setup = 1;
>> 			}
>> 			if (need_setup) {


-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-03 19:12                 ` Paul Larson
  2004-09-03 19:19                   ` Mike Waychison
@ 2004-09-03 20:08                   ` David S. Miller
  1 sibling, 0 replies; 34+ messages in thread
From: David S. Miller @ 2004-09-03 20:08 UTC (permalink / raw)
  To: Paul Larson; +Cc: brian.somers, Michael.Waychison, linux-kernel

On Fri, 03 Sep 2004 14:12:58 -0500
Paul Larson <plars@linuxtestproject.org> wrote:

> I tried this patch alone on top of 2.6.9-rc1 and tg3 is still broken for
> me on JS20 blades.  Was there another patch I should have applied in
> conjunction with this?

Use current 2.6.9 which has all of the updates.
The driver should be version 3.9


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-03 19:19                   ` Mike Waychison
@ 2004-09-03 20:18                     ` Roland Dreier
  2004-09-03 20:30                       ` David S. Miller
  0 siblings, 1 reply; 34+ messages in thread
From: Roland Dreier @ 2004-09-03 20:18 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Paul Larson, David S. Miller, Brian Somers, lkml

    Paul> I tried this patch alone on top of 2.6.9-rc1 and tg3 is
    Paul> still broken for me on JS20 blades.  Was there another patch
    Paul> I should have applied in conjunction with this?

Me too -- I copied the latest BK tg3.c/tg3.h to my 2.6.8.1 tree and
tried it on my JS20 and it didn't work.  Unfortunately the JS20 blade
only has serial-over-LAN for the console, which also dies as soon as
tg3 gets loaded, so I'm not sure exactly what happened.

    Mike> Is this with or without autonegotiation enabled on the switch?

I believe that the internal ports of the BladeCenter switch are always
locked to full-duplex gigabit operation (ie no autoneg).  In the
switch management GUI, there is a pull-down menu for setting
Speed/Duplex of external ports, but for internal ports to the blades,
there is no menu (just a hard-coded display of 1000/Full).

Thanks,
  Roland


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-03 20:18                     ` Roland Dreier
@ 2004-09-03 20:30                       ` David S. Miller
  2004-09-03 20:40                         ` Roland Dreier
  2004-09-03 23:24                         ` Roland Dreier
  0 siblings, 2 replies; 34+ messages in thread
From: David S. Miller @ 2004-09-03 20:30 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Michael.Waychison, plars, Brian.Somers, linux-kernel

On Fri, 03 Sep 2004 13:18:11 -0700
Roland Dreier <roland@topspin.com> wrote:

>     Paul> I tried this patch alone on top of 2.6.9-rc1 and tg3 is
>     Paul> still broken for me on JS20 blades.  Was there another patch
>     Paul> I should have applied in conjunction with this?
> 
> Me too -- I copied the latest BK tg3.c/tg3.h to my 2.6.8.1 tree and
> tried it on my JS20 and it didn't work.

What do you mean by "latest"?  If it doesn't indicate driver
version 3.9 it is not the latest.

Please make sure you try current sources, I've had nothing
but positive reports for IBM blades from people actually
using the correct current 3.9 driver.


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-03 20:30                       ` David S. Miller
@ 2004-09-03 20:40                         ` Roland Dreier
  2004-09-03 23:24                         ` Roland Dreier
  1 sibling, 0 replies; 34+ messages in thread
From: Roland Dreier @ 2004-09-03 20:40 UTC (permalink / raw)
  To: David S. Miller; +Cc: Michael.Waychison, plars, Brian.Somers, linux-kernel

    David> What do you mean by "latest"?  If it doesn't indicate
    David> driver version 3.9 it is not the latest.

"latest" == pulled last night.  (And yes it says version 3.9)

    David> Please make sure you try current sources, I've had nothing
    David> but positive reports for IBM blades from people actually
    David> using the correct current 3.9 driver.

I'll give it another try -- it could also be my chassis which is a
little flaky.

Thanks,
  Roland


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-03 20:30                       ` David S. Miller
  2004-09-03 20:40                         ` Roland Dreier
@ 2004-09-03 23:24                         ` Roland Dreier
  2004-09-07 18:33                           ` Jake Moilanen
  1 sibling, 1 reply; 34+ messages in thread
From: Roland Dreier @ 2004-09-03 23:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: Michael.Waychison, plars, Brian.Somers, linux-kernel

    David> Please make sure you try current sources, I've had nothing
    David> but positive reports for IBM blades from people actually
    David> using the correct current 3.9 driver.

I tried it with a full build of a BK tree pulled last night, and it
definitely didn't work.  Some relevant output:

    tg3.c:v3.9 (August 30, 2004)
    eth0: Tigon3 [partno(none) rev 2003 PHY(serdes)] (PCIX:133MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:1e:88:56
    eth0: HostTXDS[1] RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
    eth1: Tigon3 [partno(none) rev 2003 PHY(serdes)] (PCIX:133MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:1e:88:57
    eth1: HostTXDS[1] RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]

and then as soon as the init scripts try to bring up the interface:

    Setting up network interfaces:
        lo
        lo        IP address: 127.0.0.1/8                                 done
        dummy0
        dummy0    No configuration found for dummy0                       unused
        eth0      device: Broadcom Corporation NetXtreme BCM5
    system>
    system> console -T system:blade[11]
    SOL is not ready

(the last three lines are the management console taking over again
after the serial-over-LAN has died)

Just to be clear, I'm running a ppc64 kernel on a JS20 blade (dual PPC
970) with BCM5704S.  The HS20 blade (dual Xeon) also has a BCM5703X
but I haven't tried the latest driver on one of those yet.

Thanks,
  Roland


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-03 23:24                         ` Roland Dreier
@ 2004-09-07 18:33                           ` Jake Moilanen
  2004-09-07 19:52                             ` Roland Dreier
  0 siblings, 1 reply; 34+ messages in thread
From: Jake Moilanen @ 2004-09-07 18:33 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David S. Miller, Michael.Waychison, plars, Brian.Somers, linux-kernel


>     Setting up network interfaces:
>         lo
>         lo        IP address: 127.0.0.1/8                                 done
>         dummy0
>         dummy0    No configuration found for dummy0                       unused
>         eth0      device: Broadcom Corporation NetXtreme BCM5
>     system>
>     system> console -T system:blade[11]
>     SOL is not ready

Whenever an adapter reset is done (eg ifconfig up) on the same adapter
that SoL is using, you'll lose SoL.  SoL usually comes back, although
I've not had much luck ever since the Sun auto negotiation patch went
in.  One fix/workaround to not losing your SoL connection is having the
network go only over eth1 (assuming you have two switch modules).

Thanks,
Jake


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-07 18:33                           ` Jake Moilanen
@ 2004-09-07 19:52                             ` Roland Dreier
  2004-09-08 12:34                               ` Jake Moilanen
  0 siblings, 1 reply; 34+ messages in thread
From: Roland Dreier @ 2004-09-07 19:52 UTC (permalink / raw)
  To: Jake Moilanen
  Cc: David S. Miller, Michael.Waychison, plars, Brian.Somers, linux-kernel

    Jake> Whenever an adapter reset is done (eg ifconfig up) on the
    Jake> same adapter that SoL is using, you'll lose SoL.  SoL
    Jake> usually comes back, although I've not had much luck ever
    Jake> since the Sun auto negotiation patch went in.  One
    Jake> fix/workaround to not losing your SoL connection is having
    Jake> the network go only over eth1 (assuming you have two switch
    Jake> modules).

Thanks -- unfortunately I only have one switch module :(

With the 3.9 tg3 driver, neither SoL nor the real network seems to
ever come back.  As far as I can tell, the network is dead (and
without SoL there's no way for me to see what happens to the kernel).

Have you had success with the latest tg3 on JS20?

 - R.


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-07 19:52                             ` Roland Dreier
@ 2004-09-08 12:34                               ` Jake Moilanen
  2004-09-08 13:07                                 ` Anton Blanchard
  2004-09-08 13:55                                 ` Paul Larson
  0 siblings, 2 replies; 34+ messages in thread
From: Jake Moilanen @ 2004-09-08 12:34 UTC (permalink / raw)
  To: Roland Dreier
  Cc: David S. Miller, Michael.Waychison, plars, Brian.Somers, linux-kernel


> With the 3.9 tg3 driver, neither SoL nor the real network seems to
> ever come back.  As far as I can tell, the network is dead (and
> without SoL there's no way for me to see what happens to the kernel).
> 
> Have you had success with the latest tg3 on JS20?

I've had mixed results.  On some of my blades it never works.  On others
it will come up every third attempt or so.

Thanks,
Jake


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-08 12:34                               ` Jake Moilanen
@ 2004-09-08 13:07                                 ` Anton Blanchard
  2004-09-13 22:48                                   ` David S. Miller
  2004-09-08 13:55                                 ` Paul Larson
  1 sibling, 1 reply; 34+ messages in thread
From: Anton Blanchard @ 2004-09-08 13:07 UTC (permalink / raw)
  To: Jake Moilanen
  Cc: Roland Dreier, David S. Miller, Michael.Waychison, plars,
	Brian.Somers, linux-kernel

 
> I've had mixed results.  On some of my blades it never works.  On others
> it will come up every third attempt or so.

2.6 BK as of 2 days ago wasn't working on my JS20 either. I've been
meaning to look closer but haven't had a chance yet.

Anton


* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-08 12:34                               ` Jake Moilanen
  2004-09-08 13:07                                 ` Anton Blanchard
@ 2004-09-08 13:55                                 ` Paul Larson
  2004-09-10 16:00                                   ` Paul Larson
  1 sibling, 1 reply; 34+ messages in thread
From: Paul Larson @ 2004-09-08 13:55 UTC (permalink / raw)
  To: Jake Moilanen
  Cc: Roland Dreier, David S. Miller, Michael.Waychison, Brian.Somers, lkml

[-- Attachment #1: Type: text/plain, Size: 556 bytes --]

I've had no success on any of the blades or bladecenters I've tried it
on.

On Wed, 2004-09-08 at 07:34, Jake Moilanen wrote:
> > With the 3.9 tg3 driver, neither SoL nor the real network seems to
> > ever come back.  As far as I can tell, the network is dead (and
> > without SoL there's no way for me to see what happens to the kernel).
> > 
> > Have you had success with the latest tg3 on JS20?
> 
> I've had mixed results.  On some of my blades it never works.  On others
> it will come up every third attempt or so.
> 
> Thanks,
> Jake

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-08-26 19:37               ` David S. Miller
  2004-08-29  9:56                 ` Pekka Pietikainen
@ 2004-09-10 12:35                 ` Brian Somers
  2004-09-10 19:40                   ` Roland Dreier
  2004-09-10 20:53                   ` David S. Miller
  1 sibling, 2 replies; 34+ messages in thread
From: Brian Somers @ 2004-09-10 12:35 UTC (permalink / raw)
  To: David S. Miller; +Cc: Michael.Waychison, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1367 bytes --]

David S. Miller wrote:
> On Thu, 26 Aug 2004 11:49:57 +0100
> Brian Somers <brian.somers@sun.com> wrote:
> 
> 
>>Can we get this guy to try running an older version of tg3 to see
>>what change introduced the issue?
> 
> 
> Brian, we already narrowed it down to exactly the hw autoneg
> changes Sun wrote.  It breaks the IBM blades' onboard 5704
> fibre chips.  Reverting your change or disabling hw autoneg
> in the new code both fix the problem.

The problem seems to be that autoneg is disabled on the IBM switches.
After disabling autoneg on the Sun shelf switches, I see the problem.
This patch fixes things by reverting to sw autoneg which defaults to
a 1000Mbps/full-duplex link but with no flow control when it fails
(IBM should really have autoneg enabled!) - I'd appreciate it if
someone could test this against an IBM blade.

The patch is against the 2.6.8-1.521 version of tg3.c but should
hopefully apply to other recent versions.  If it fails to build
because tw32_f() isn't defined, change

     tw32_f(x, y);

to

     tw32(x, y);
     tr32(x);
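
As a rough illustration of why the write is followed by a read of the
same register, here is a minimal sketch in plain C.  It assumes a
made-up device model in which writes can sit in a bus buffer until the
next read; fake_dev, fake_tw32(), fake_tr32(), fake_tw32_f() and
device_sees() are hypothetical names for this example and are not the
tg3.c macros themselves.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 8

/* Hypothetical model of a device behind a bus that may post writes. */
struct fake_dev {
	uint32_t regs[NREGS];        /* values the device has actually seen */
	uint32_t pending_val[NREGS]; /* values still sitting in the bus buffer */
	int      pending[NREGS];     /* 1 if a write to this register is posted */
};

/* Plain write: queued, not yet visible to the device. */
static void fake_tw32(struct fake_dev *d, unsigned reg, uint32_t val)
{
	assert(reg < NREGS);
	d->pending_val[reg] = val;
	d->pending[reg] = 1;
}

/* Read: pushes every posted write out, then returns the device value. */
static uint32_t fake_tr32(struct fake_dev *d, unsigned reg)
{
	assert(reg < NREGS);
	for (unsigned i = 0; i < NREGS; i++) {
		if (d->pending[i]) {
			d->regs[i] = d->pending_val[i];
			d->pending[i] = 0;
		}
	}
	return d->regs[reg];
}

/* Flushing write: the write followed by a read-back of the same register. */
static void fake_tw32_f(struct fake_dev *d, unsigned reg, uint32_t val)
{
	fake_tw32(d, reg, val);
	(void)fake_tr32(d, reg);
}

/* What the device has seen so far, without triggering a flush. */
static uint32_t device_sees(const struct fake_dev *d, unsigned reg)
{
	assert(reg < NREGS);
	return d->regs[reg];
}
```

In this model a value written with fake_tw32() is not visible through
device_sees() until some later read, whereas fake_tw32_f() makes it
visible before returning, which is what matters when a udelay()
follows the write.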

Cheers.

-- 
Brian Somers                                            Sun Microsystems
                                             Sparc House, Guillemont Park
Software Engineer - LSE                          Minley Road, Blackwater
Tel: +44 1252 421 263   Ext: 21263                    Camberley GU17 9QG

[-- Attachment #2: tg3.c.patch --]
[-- Type: text/plain, Size: 1030 bytes --]

--- tg3.c.orig	2004-09-10 13:24:28.000000000 +0100
+++ tg3.c	2004-09-10 13:24:14.000000000 +0100
@@ -2051,9 +2051,25 @@
 				break;
 			udelay(1);
 		}
-		if (tick >= 195000)
-			printk(KERN_INFO PFX "%s: HW autoneg failed !\n",
+		if (tick >= 195000) {
+			u32 digctrl, txctrl;
+
+			printk(KERN_INFO PFX
+			    "%s: HW autoneg failed - disabled\n",
 			    tp->dev->name);
+
+			digctrl = tr32(SG_DIG_CTRL);
+			digctrl &= ~SG_DIG_USING_HW_AUTONEG;
+
+			txctrl = tr32(MAC_SERDES_CFG);
+			txctrl &= ~MAC_SERDES_CFG_EDGE_SELECT;
+			tw32_f(MAC_SERDES_CFG, txctrl);
+			tw32_f(SG_DIG_CTRL, digctrl | SG_DIG_SOFT_RESET);
+			udelay(5);
+			tw32_f(SG_DIG_CTRL, digctrl);
+
+			tp->tg3_flags2 &= ~TG3_FLG2_HW_AUTONEG;
+		}
 	}
 
 	/* Reset when initting first time or we have a link. */
@@ -5280,7 +5296,6 @@
 		txctrl = tr32(MAC_SERDES_CFG);
 		tw32_f(MAC_SERDES_CFG, txctrl | MAC_SERDES_CFG_EDGE_SELECT);
 		tw32_f(SG_DIG_CTRL, digctrl | SG_DIG_SOFT_RESET);
-		tr32(SG_DIG_CTRL);
 		udelay(5);
 		tw32_f(SG_DIG_CTRL, digctrl);
 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-08 13:55                                 ` Paul Larson
@ 2004-09-10 16:00                                   ` Paul Larson
  0 siblings, 0 replies; 34+ messages in thread
From: Paul Larson @ 2004-09-10 16:00 UTC (permalink / raw)
  To: Jake Moilanen
  Cc: Roland Dreier, David S. Miller, Michael.Waychison, Brian.Somers, lkml

I just realized that I forgot to mention: the latest kernel I tried was
2.6.9-rc1-bk15, which was still broken (the network doesn't work).
However, the autoneg failed messages in dmesg were gone.  I tried
multiple reboots to see if it would work/not work at random and I never
saw it work even once.

Thanks,
Paul Larson

On Wed, 2004-09-08 at 08:55, Paul Larson wrote:
> I've had no success on any of the blades or bladecenters I've tried it
> on.
> 
> On Wed, 2004-09-08 at 07:34, Jake Moilanen wrote:
> > > With the 3.9 tg3 driver, neither SoL nor the real network seems to
> > > ever come back.  As far as I can tell, the network is dead (and
> > > without SoL there's no way for me to see what happens to the kernel).
> > > 
> > > Have you had success with the latest tg3 on JS20?
> > 
> > I've had mixed results.  On some of my blades it never works.  On others
> > it will come up every third attempt or so.
> > 
> > Thanks,
> > Jake


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-10 12:35                 ` Brian Somers
@ 2004-09-10 19:40                   ` Roland Dreier
  2004-09-10 20:53                   ` David S. Miller
  1 sibling, 0 replies; 34+ messages in thread
From: Roland Dreier @ 2004-09-10 19:40 UTC (permalink / raw)
  To: Brian Somers; +Cc: David S. Miller, Michael.Waychison, linux-kernel

    Brian> The problem seems to be that autoneg is disabled on the IBM
    Brian> switches.  After disabling autoneg on the Sun shelf
    Brian> switches, I see the problem.  This patch fixes things by
    Brian> reverting to sw autoneg which defaults to a
    Brian> 1000Mbps/full-duplex link but with no flow control when it
    Brian> fails (IBM should really have autoneg enabled!) - I'd
    Brian> appreciate it if someone could test this against an IBM
    Brian> blade.

Yes, 2.6.8.1 with this patch works for me on a JS20.  There is a pause
of maybe 20 seconds when the interface is brought up (while hardware
autoneg times out?) and then the network comes up (and serial-over-lan
recovers).

Unfortunately this patch doesn't apply to the latest bk (3.9) tg3 driver.

Thanks,
  Roland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-10 12:35                 ` Brian Somers
  2004-09-10 19:40                   ` Roland Dreier
@ 2004-09-10 20:53                   ` David S. Miller
  2004-09-10 21:05                     ` Roland Dreier
  2004-09-10 22:14                     ` Brian Somers
  1 sibling, 2 replies; 34+ messages in thread
From: David S. Miller @ 2004-09-10 20:53 UTC (permalink / raw)
  To: Brian Somers; +Cc: Michael.Waychison, linux-kernel

On Fri, 10 Sep 2004 13:35:14 +0100
Brian Somers <brian.somers@sun.com> wrote:

> The problem seems to be that autoneg is disabled on the IBM switches.
> After disabling autoneg on the Sun shelf switches, I see the problem.
> This patch fixes things by reverting to sw autoneg which defaults to
> a 1000Mbps/full-duplex link but with no flow control when it fails
> (IBM should really have autoneg enabled!) - I'd appreciate it if
> someone could test this against an IBM blade.

Did you see the fix I posted the other day and have
already merged upstream?

The real problem was the MAC_STATUS register checking in
tg3_timer() that we use to determine if we should call
the PHY code.  Specifically, we were failing to test whether
MAC_STATUS_SIGNAL_DET is set; when we are trying to bring
the link up, that bit being set means we should call
tg3_setup_phy().
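
The check described above could look roughly like the following
sketch.  It is not the code from tg3.c: the helper name, the SK_*
constants and their bit values are placeholders for illustration, and
the carrier-up branch and the PCS-sync test are assumptions about the
overall shape of such a check.  Only the signal-detect test comes from
the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; consult tg3.h for the real definitions. */
#define SK_PCS_SYNCED        0x00000001u
#define SK_SIGNAL_DET        0x00000002u
#define SK_LNKSTATE_CHANGED  0x00001000u

/*
 * Hypothetical helper: decide whether the periodic timer should call the
 * PHY setup routine for a SerDes (fibre) port.
 */
static bool sk_needs_phy_setup(bool carrier_ok, uint32_t mac_status)
{
	if (carrier_ok)
		/* Link is up: only react to a reported link-state change. */
		return (mac_status & SK_LNKSTATE_CHANGED) != 0;

	/* Link is down: signal detect (or PCS sync) means try to bring it up. */
	return (mac_status & (SK_PCS_SYNCED | SK_SIGNAL_DET)) != 0;
}

/* Small self-check of the cases discussed above. */
static void sk_timer_selfcheck(void)
{
	assert(sk_needs_phy_setup(false, SK_SIGNAL_DET));
	assert(!sk_needs_phy_setup(false, 0));
	assert(sk_needs_phy_setup(true, SK_LNKSTATE_CHANGED));
}
```

With the link down and only the signal-detect bit set, the helper asks
for PHY setup; a check that ignored that bit would leave the port idle
in exactly that state.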

There are still some nagging problems with certain blades even
with my current code.  Brian, if you want to help I'd really
appreciate it if you worked with current tg3 sources as I rewrote
the 5704 hw autoneg support from scratch since it was missing
a hw bug workaround and had other issues as well.

Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-10 20:53                   ` David S. Miller
@ 2004-09-10 21:05                     ` Roland Dreier
  2004-09-10 21:45                       ` David S. Miller
  2004-09-10 22:14                     ` Brian Somers
  1 sibling, 1 reply; 34+ messages in thread
From: Roland Dreier @ 2004-09-10 21:05 UTC (permalink / raw)
  To: David S. Miller; +Cc: Brian Somers, Michael.Waychison, linux-kernel

    David> The real problem was the MAC_STATUS register checking in
    David> tg3_timer() that we use to determine if we should call the
    David> PHY code.  Specifically, we were failing to test
    David> MAC_STATUS_SIGNAL_DET being set, which when trying to bring
    David> the link up means we should call tg3_setup_phy().

    David> There are still some nagging problems with certain blades
    David> even with my current code.  Brian, if you want to help I'd
    David> really appreciate it if you worked with current tg3 sources
    David> as I rewrote the 5704 hw autoneg support from scratch since
    David> it was missing a hw bug workaround and had other issues as
    David> well.

Hmm... for what it's worth, Brian's patch against 2.6.8.1 works on my
JS20 blade, and the latest BK tg3 code doesn't.

Thanks,
  Roland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-10 21:05                     ` Roland Dreier
@ 2004-09-10 21:45                       ` David S. Miller
  0 siblings, 0 replies; 34+ messages in thread
From: David S. Miller @ 2004-09-10 21:45 UTC (permalink / raw)
  To: Roland Dreier; +Cc: brian.somers, Michael.Waychison, linux-kernel

On Fri, 10 Sep 2004 14:05:03 -0700
Roland Dreier <roland@topspin.com> wrote:

>     David> The real problem was the MAC_STATUS register checking in
>     David> tg3_timer() that we use to determine if we should call the
>     David> PHY code.  Specifically, we were failing to test
>     David> MAC_STATUS_SIGNAL_DET being set, which when trying to bring
>     David> the link up means we should call tg3_setup_phy().
> 
>     David> There are still some nagging problems with certain blades
>     David> even with my current code.  Brian, if you want to help I'd
>     David> really appreciate it if you worked with current tg3 sources
>     David> as I rewrote the 5704 hw autoneg support from scratch since
>     David> it was missing a hw bug workaround and had other issues as
>     David> well.
> 
> Hmm... for what it's worth, Brian's patch against 2.6.8.1 works on my
> JS20 blade, and the latest BK tg3 code doesn't.

Ok, some debugging to do. :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-10 20:53                   ` David S. Miller
  2004-09-10 21:05                     ` Roland Dreier
@ 2004-09-10 22:14                     ` Brian Somers
  1 sibling, 0 replies; 34+ messages in thread
From: Brian Somers @ 2004-09-10 22:14 UTC (permalink / raw)
  To: David S. Miller; +Cc: Michael.Waychison, linux-kernel

David S. Miller wrote:
> On Fri, 10 Sep 2004 13:35:14 +0100
> Brian Somers <brian.somers@sun.com> wrote:
> 
> 
>>The problem seems to be that autoneg is disabled on the IBM switches.
>>After disabling autoneg on the Sun shelf switches, I see the problem.
>>This patch fixes things by reverting to sw autoneg which defaults to
>>a 1000Mbps/full-duplex link but with no flow control when it fails
>>(IBM should really have autoneg enabled!) - I'd appreciate it if
>>someone could test this against an IBM blade.
> 
> 
> Did you see the fix I posted the other day and have
> already merged upstream?
> 
> The real problem was the MAC_STATUS register checking in
> tg3_timer() that we use to determine if we should call
> the PHY code.  Specifically, we were failing to test
> MAC_STATUS_SIGNAL_DET being set, which when trying to
> bring the link up means we should call tg3_setup_phy().

To be honest, when I saw your mail about that change, I was happy
to down tools as the problem was clearly fixed.  At that point I
already had suspicions that the optimisations in this area may
have issues.

But after a few more days, all the IBM blade folks were still
saying they were having problems - and then Mike W gave me a
kick ;*P

I think the issue with the code up till now is that when HW
autoneg fails, the driver just hangs about waiting for the
hardware to do something.  In my previous testing here, the
switch would eventually recover (my only way of breaking the
switch was to drop it to the monitor prompt or reload it),
and at that point tg3 picks up the link status change and
everything's rosy.
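
To make the alternative concrete, here is a minimal sketch of "poll
for a bounded time, then fall back" in plain C.  It is only an
illustration of the idea: bring_up_link(), the callback type and the
two sample partners are hypothetical names, and the real driver works
on hardware registers rather than a callback.  The 195000 limit used
in the usage note is the tick bound that appears in the patch earlier
in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum link_mode {
	LINK_HW_AUTONEG,     /* partner completed hardware autoneg */
	LINK_FORCED_1000FD   /* gave up waiting: 1000 Mbps full duplex, no flow control */
};

/* Hypothetical poll callback: true once the link partner has answered. */
typedef bool (*autoneg_done_fn)(unsigned tick, void *ctx);

/* Poll for at most max_ticks, then stop waiting and force the link. */
static enum link_mode bring_up_link(autoneg_done_fn done, void *ctx,
				    unsigned max_ticks)
{
	assert(done != NULL);

	for (unsigned tick = 0; tick < max_ticks; tick++) {
		if (done(tick, ctx))
			return LINK_HW_AUTONEG;
	}

	/* Timed out: do not keep waiting for the hardware. */
	return LINK_FORCED_1000FD;
}

/* Sample partner with autoneg disabled: it never answers. */
static bool partner_never_answers(unsigned tick, void *ctx)
{
	(void)tick;
	(void)ctx;
	return false;
}

/* Sample partner that answers once tick reaches *(unsigned *)ctx. */
static bool partner_answers_at(unsigned tick, void *ctx)
{
	return tick >= *(unsigned *)ctx;
}
```

Calling bring_up_link(partner_never_answers, NULL, 195000) returns
LINK_FORCED_1000FD instead of waiting forever, while a partner that
answers within the bound still gets LINK_HW_AUTONEG.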

> There are still some nagging problems with certain blades even
> with my current code.  Brian, if you want to help I'd really
> appreciate it if you worked with current tg3 sources as I rewrote
> the 5704 hw autoneg support from scratch since it was missing
> a hw bug workaround and had other issues as well.
> 
> Thanks.

Yes, I really ought to be running a current box, but for various
reasons I've been quite short of hardware for the past couple of
months.  I now have a lab again, but it's not yet turned on, so
I'm still scrounging hardware from people...

Feeble excuses... but they're the only ones I have :-/

-- 
Brian Somers                                            Sun Microsystems
                                             Sparc House, Guillemont Park
Software Engineer - LSE                          Minley Road, Blackwater
Tel: +44 1252 421 263   Ext: 21263                    Camberley GU17 9QG

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-08 13:07                                 ` Anton Blanchard
@ 2004-09-13 22:48                                   ` David S. Miller
  2004-09-14 22:20                                     ` Mike Waychison
  0 siblings, 1 reply; 34+ messages in thread
From: David S. Miller @ 2004-09-13 22:48 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: moilanen, roland, Michael.Waychison, plars, Brian.Somers, linux-kernel

On Wed, 8 Sep 2004 23:07:28 +1000
Anton Blanchard <anton@samba.org> wrote:

>  
> > I've had mixed results.  On some of my blades it never works.  On others
> > it will come up every third attempt or so.
> 
> 2.6 BK as of 2 days ago wasn't working on my JS20 either. I've been
> meaning to look closer but haven't had a chance yet.

Are you going to work on this soon, Anton?  I will cook up some
debugging patches; this bug sucks and I want to fix it soon.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-13 22:48                                   ` David S. Miller
@ 2004-09-14 22:20                                     ` Mike Waychison
  2004-09-14 22:36                                       ` David S. Miller
                                                         ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Mike Waychison @ 2004-09-14 22:20 UTC (permalink / raw)
  To: David S. Miller
  Cc: Anton Blanchard, moilanen, roland, plars, Brian.Somers, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1338 bytes --]

David S. Miller wrote:
> On Wed, 8 Sep 2004 23:07:28 +1000
> Anton Blanchard <anton@samba.org> wrote:
>
>
>>
>>
>>>I've had mixed results.  On some of my blades it never works.  On others
>>>it will come up every third attempt or so.
>>
>>2.6 BK as of 2 days ago wasn't working on my JS20 either. I've been
>>meaning to look closer but haven't had a chance yet.
>
>
> Are you going to work on this soon Anton?  I will cook up some
> debugging patches, this bug sucks and I want to fix it soon.

I've gone through the changes you've made lately and I found a thinko,
patch attached.

With this patch, I can turn off autoneg on our b1600's switch and the
b200x falls back to 1000FD as required.
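
For readers who do not have the source open, the shape of the thinko
can be sketched in a few lines of C.  This is a stand-alone
illustration, not an excerpt from tg3.c: fake_mac_serdes_cfg and the
helper functions are hypothetical, and the starting value 0x1 in the
self-check is arbitrary.  The constant 0x4010880 is the one visible in
the attached patch context.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware register; not a real tg3 symbol. */
static uint32_t fake_mac_serdes_cfg;

static void fake_write_serdes(uint32_t v)
{
	fake_mac_serdes_cfg = v;
}

/* Buggy shape: builds the new value, then writes the unmodified input. */
static void fallback_buggy(uint32_t serdes_cfg, uint32_t extra_bits)
{
	uint32_t val = serdes_cfg;

	val |= extra_bits;
	(void)val;                     /* computed but never used */
	fake_write_serdes(serdes_cfg); /* the thinko: wrong variable */
}

/* Fixed shape: writes the value that was just computed. */
static void fallback_fixed(uint32_t serdes_cfg, uint32_t extra_bits)
{
	uint32_t val = serdes_cfg | extra_bits;

	fake_write_serdes(val);
}

/* Self-check showing the difference between the two shapes. */
static void thinko_selfcheck(void)
{
	fallback_buggy(0x1, 0x4010880);
	assert(fake_mac_serdes_cfg == 0x1);

	fallback_fixed(0x1, 0x4010880);
	assert(fake_mac_serdes_cfg == 0x4010881);
}
```

In the buggy shape the extra bits never reach the register, so the
fallback configuration is silently skipped even though the code looks
as if it applies it.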

Signed-Off: Mike Waychison <michael.waychison@sun.com>

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[-- Attachment #2: tg3-hw-autoneg-fallback.patch --]
[-- Type: text/x-patch, Size: 890 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.2191  -> 1.2192 
#	   drivers/net/tg3.c	1.203   -> 1.204  
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/09/14	root@lnx42.localdomain	1.2192
# tg3.c:
#   - fixed small thinko for hw autoneg fallback to 1000FD
# --------------------------------------------
#
diff -Nru a/drivers/net/tg3.c b/drivers/net/tg3.c
--- a/drivers/net/tg3.c	Tue Sep 14 22:13:16 2004
+++ b/drivers/net/tg3.c	Tue Sep 14 22:13:16 2004
@@ -2168,7 +2168,7 @@
 					else
 						val |= 0x4010880;
 
-					tw32_f(MAC_SERDES_CFG, serdes_cfg);
+					tw32_f(MAC_SERDES_CFG, val);
 				}
 
 				tw32_f(SG_DIG_CTRL, 0x01388400);

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-14 22:20                                     ` Mike Waychison
@ 2004-09-14 22:36                                       ` David S. Miller
  2004-09-14 22:58                                       ` Jake Moilanen
  2004-09-15  0:34                                       ` Roland Dreier
  2 siblings, 0 replies; 34+ messages in thread
From: David S. Miller @ 2004-09-14 22:36 UTC (permalink / raw)
  To: Mike Waychison; +Cc: anton, moilanen, roland, plars, Brian.Somers, linux-kernel

On Tue, 14 Sep 2004 18:20:02 -0400
Mike Waychison <Michael.Waychison@Sun.COM> wrote:

> I've gone through the changes you've made lately and I found a thinko,
> patch attached.
> 
> With this patch, I can turn off autoneg on our b1600's switch and the
> b200x falls back to 1000FD as required.
> 
> Signed-Off: Mike Waychison <michael.waychison@sun.com>

Thanks Mike, come up to SF and I'll buy you a round
or two. :-)


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-14 22:20                                     ` Mike Waychison
  2004-09-14 22:36                                       ` David S. Miller
@ 2004-09-14 22:58                                       ` Jake Moilanen
  2004-09-15  0:34                                       ` Roland Dreier
  2 siblings, 0 replies; 34+ messages in thread
From: Jake Moilanen @ 2004-09-14 22:58 UTC (permalink / raw)
  To: Mike Waychison
  Cc: David S. Miller, Anton Blanchard, roland, plars, Brian.Somers,
	linux-kernel


> I've gone through the changes you've made lately and I found a thinko,
> patch attached.
> 
> With this patch, I can turn off autoneg on our b1600's switch and the
> b200x falls back to 1000FD as required.
> 
> Signed-Off: Mike Waychison <michael.waychison@sun.com>
> 

This is working on my JS20.   Nice work Mike.

Jake

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)
  2004-09-14 22:20                                     ` Mike Waychison
  2004-09-14 22:36                                       ` David S. Miller
  2004-09-14 22:58                                       ` Jake Moilanen
@ 2004-09-15  0:34                                       ` Roland Dreier
  2 siblings, 0 replies; 34+ messages in thread
From: Roland Dreier @ 2004-09-15  0:34 UTC (permalink / raw)
  To: Mike Waychison
  Cc: David S. Miller, Anton Blanchard, moilanen, plars, Brian.Somers,
	linux-kernel

Works on my JS20 as well.  Thanks!

 - Roland

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread (newest message: 2004-09-15  0:34 UTC)

Thread overview: 34+ messages
     [not found] <20040816110000.1120.31256.Mailman@lists.us.dell.com>
2004-08-16 11:51 ` TG3 doesn't work in kernel 2.4.27 (David S. Miller) Tetsuo Handa
2004-08-16 21:38   ` David S. Miller
2004-08-25 17:48     ` Mike Waychison
2004-08-25 19:08       ` David S. Miller
2004-08-25 20:04         ` Mike Waychison
2004-08-26  0:58           ` David S. Miller
2004-08-26 10:49             ` Brian Somers
2004-08-26 19:37               ` David S. Miller
2004-08-29  9:56                 ` Pekka Pietikainen
2004-09-10 12:35                 ` Brian Somers
2004-09-10 19:40                   ` Roland Dreier
2004-09-10 20:53                   ` David S. Miller
2004-09-10 21:05                     ` Roland Dreier
2004-09-10 21:45                       ` David S. Miller
2004-09-10 22:14                     ` Brian Somers
2004-08-30 23:11               ` David S. Miller
2004-09-03 19:12                 ` Paul Larson
2004-09-03 19:19                   ` Mike Waychison
2004-09-03 20:18                     ` Roland Dreier
2004-09-03 20:30                       ` David S. Miller
2004-09-03 20:40                         ` Roland Dreier
2004-09-03 23:24                         ` Roland Dreier
2004-09-07 18:33                           ` Jake Moilanen
2004-09-07 19:52                             ` Roland Dreier
2004-09-08 12:34                               ` Jake Moilanen
2004-09-08 13:07                                 ` Anton Blanchard
2004-09-13 22:48                                   ` David S. Miller
2004-09-14 22:20                                     ` Mike Waychison
2004-09-14 22:36                                       ` David S. Miller
2004-09-14 22:58                                       ` Jake Moilanen
2004-09-15  0:34                                       ` Roland Dreier
2004-09-08 13:55                                 ` Paul Larson
2004-09-10 16:00                                   ` Paul Larson
2004-09-03 20:08                   ` David S. Miller
