linuxppc-dev.lists.ozlabs.org archive mirror
* Gianfar tx-babbling-errors
@ 2009-02-18 16:16 Scott Coulter
  2009-02-18 17:03 ` Kumar Gala
  2009-02-20 19:16 ` Haruki Dai-R35557
  0 siblings, 2 replies; 16+ messages in thread
From: Scott Coulter @ 2009-02-18 16:16 UTC (permalink / raw)
  To: linuxppc-dev


Hi all,

As a simple stress test for my board with an MPC8572E and an MPC8568E on
it, I set up both processors to boot Linux 2.6.27.6 with an NFS root and
then perform repeated native compiles of a Linux kernel over NFS.  After
running for 4 days straight or so with between 250-300 build cycles per
processor, I stopped the builds and ran ethtool to look for any odd
statistics.  Both processors reported non-zero values for
tx-babbling-errors.  Both processors reported around 1300
tx-babbling-errors out of about 80,000,000 Tx packets.  Should I be
concerned about the tx-babbling-errors?  What conditions would cause
these errors to be reported?

Thanks,
Scott





___________________________________________________________________

  Scott N. Coulter
  Senior Software Engineer

  Cyclone Microsystems
  370 James Street              Phone:  203.786.5536 ext. 118
  New Haven, CT 06513-3051      Email:  scott.coulter@cyclone.com
  U.S.A.                        Web:    http://www.cyclone.com
___________________________________________________________________

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Gianfar tx-babbling-errors
  2009-02-18 16:16 Gianfar tx-babbling-errors Scott Coulter
@ 2009-02-18 17:03 ` Kumar Gala
  2009-02-18 17:22   ` Scott Coulter
  2009-02-20 19:16 ` Haruki Dai-R35557
  1 sibling, 1 reply; 16+ messages in thread
From: Kumar Gala @ 2009-02-18 17:03 UTC (permalink / raw)
  To: Scott Coulter; +Cc: linuxppc-dev


On Feb 18, 2009, at 10:16 AM, Scott Coulter wrote:

>
> Hi all,
>
> As a simple stress test for my board with an MPC8572E and an MPC8568E
> on it, I setup both processors to boot linux 2.6.27.6 with an NFS root
> and then perform repeated native compiles of a linux kernel over NFS.
> After running for 4 days straight or so with between 250-300 build
> cycles per
> processor, I stopped the builds and ran ethtool to look for any odd
> statistics.  Both processors reported non-zero values for
> tx-babbling-errors.  Both processors reported around 1300
> tx-babbling-errors out of about 80,000,000 Tx packets.  Should I be
> concerned about the tx-babbling-errors?  What conditions would cause
> these errors to be reported?
>
> Thanks,
> Scott

I'm told this will occur when:

Transmitted frame > MAXFRM and MACCFG2[Huge En] = 0.

- k
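
The condition quoted above can be written as a small predicate. This is a
minimal userspace sketch, not driver or hardware code: the struct and
function names are hypothetical, and it models only the rule as stated in
this message (frame longer than MAXFRM while MACCFG2[Huge En] is clear).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two MAC settings named in the message. */
struct mac_cfg {
    uint32_t maxfrm;   /* value programmed into MAXFRM */
    bool huge_en;      /* state of MACCFG2[Huge En] */
};

/* Returns true when a frame of frame_len bytes would count as babbling. */
bool tx_would_babble(const struct mac_cfg *cfg, uint32_t frame_len)
{
    assert(cfg != NULL);
    /* Oversized frames are only flagged while huge frames are disabled. */
    return !cfg->huge_en && frame_len > cfg->maxfrm;
}
```

With huge frames disabled, only frames longer than maxfrm return true;
setting huge_en makes the predicate false for every length.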

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-18 17:03 ` Kumar Gala
@ 2009-02-18 17:22   ` Scott Coulter
  2009-02-18 17:29     ` Kumar Gala
  0 siblings, 1 reply; 16+ messages in thread
From: Scott Coulter @ 2009-02-18 17:22 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev

Kumar,

> I'm told this will occur when:

> Transmitted frame > MAXFRM and MACCFG2[Huge En] = 0.

In the driver it looks like the MACCFG2_HUGEFRAME only gets set if the
mtu > DEFAULT_RX_BUFFER_SIZE (1536 in my kernel).  It appears as though
the mtu is set to 1500.  Under what conditions would the driver attempt
to send a frame larger than the mtu?

Scott


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Gianfar tx-babbling-errors
  2009-02-18 17:22   ` Scott Coulter
@ 2009-02-18 17:29     ` Kumar Gala
  2009-02-18 20:45       ` Scott Coulter
  2009-02-19 16:17       ` Scott Coulter
  0 siblings, 2 replies; 16+ messages in thread
From: Kumar Gala @ 2009-02-18 17:29 UTC (permalink / raw)
  To: Scott Coulter; +Cc: linuxppc-dev


On Feb 18, 2009, at 11:22 AM, Scott Coulter wrote:

> Kumar,
>
>> I'm told this will occur when:
>
>> Transmitted frame > MAXFRM and MACCFG2[Huge En] = 0.
>
> In the driver it looks like the MACCFG2_HUGEFRAME only gets set if the
> mtu > DEFAULT_RX_BUFFER_SIZE (1536 in my kernel).  It appears as though
> the mtu is set to 1500.  Under what conditions would the driver attempt
> to send a frame larger than the mtu?

Can't think of any.  How about adding a BUG_ON() in the tx path to check
whether the buffer size > MTU, then re-running your tests?

- k

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-18 17:29     ` Kumar Gala
@ 2009-02-18 20:45       ` Scott Coulter
  2009-02-19 16:17       ` Scott Coulter
  1 sibling, 0 replies; 16+ messages in thread
From: Scott Coulter @ 2009-02-18 20:45 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev



Kumar,

>
> can't think of any.  How about adding a BUG_ON() in the tx path to see
> if the buffer size > MTU and re-run your tests.
>

With the following in gfar_start_xmit():

BUG_ON(skb->len > priv->dev->mtu);

I bug checked during the NFS root boot process with skb->len at 1514 and
priv->dev->mtu at 1500.  I changed it to:

BUG_ON(skb->len > DEFAULT_RX_BUFFER_SIZE);

Compiling now.  Has not failed yet, I'll let you know.

Scott
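
The 1514 versus 1500 result is consistent with skb->len counting the
14-byte Ethernet header in front of a full 1500-byte payload, while the
MTU covers the payload only. A minimal userspace sketch of an MTU-based
bound that allows for the header follows; the function names are
hypothetical, and VLAN tags and the FCS are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_HEADER_LEN 14u  /* dst MAC (6) + src MAC (6) + EtherType (2) */

/* Largest untagged frame, without FCS, that a given MTU allows. */
uint32_t max_skb_len_for_mtu(uint32_t mtu)
{
    assert(mtu <= UINT32_MAX - ETH_HEADER_LEN);  /* guard against wraparound */
    return mtu + ETH_HEADER_LEN;
}

/* True when the buffer is longer than the MTU plus the Ethernet header. */
bool skb_len_exceeds_mtu(uint32_t skb_len, uint32_t mtu)
{
    return skb_len > max_skb_len_for_mtu(mtu);
}
```

Under this bound a 1514-byte buffer with a 1500-byte MTU is within
limits, so a check of this shape would not trip during the NFS root boot.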


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-18 17:29     ` Kumar Gala
  2009-02-18 20:45       ` Scott Coulter
@ 2009-02-19 16:17       ` Scott Coulter
  2009-02-19 16:29         ` Kumar Gala
  2009-02-19 16:48         ` sjoyeau
  1 sibling, 2 replies; 16+ messages in thread
From: Scott Coulter @ 2009-02-19 16:17 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev



Kumar,

>
> can't think of any.  How about adding a BUG_ON() in the tx path to see
> if the buffer size > MTU and re-run your tests.
>

So, here are the checks I've tried in gfar_start_xmit():

BUG_ON(skb->len > DEFAULT_RX_BUFFER_SIZE);

BUG_ON(skb->len > priv->regs->maxfrm);

Neither produces a bug check, yet ethtool still reports non-zero
tx-babbling-errors.  The last check appears to match the definition of
tx-babbling-errors.  Is there a transmit path that I have missed?

Scott


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Gianfar tx-babbling-errors
  2009-02-19 16:17       ` Scott Coulter
@ 2009-02-19 16:29         ` Kumar Gala
  2009-02-19 16:38           ` Scott Coulter
  2009-02-19 16:48         ` sjoyeau
  1 sibling, 1 reply; 16+ messages in thread
From: Kumar Gala @ 2009-02-19 16:29 UTC (permalink / raw)
  To: Scott Coulter; +Cc: linuxppc-dev


On Feb 19, 2009, at 10:17 AM, Scott Coulter wrote:

>
>
> Kumar,
>
>>
>> can't think of any.  How about adding a BUG_ON() in the tx path to
>> see if the buffer size > MTU and re-run your tests.
>>
>
> So, here are the checks I've tried in gfar_start_xmit():
>
> BUG_ON(skb->len > DEFAULT_RX_BUFFER_SIZE)
>
> BUG_ON(skb->len > priv->regs->maxfrm)
>
> Neither produces a bug check yet ethtool reports non-zero
> tx-babbling-errors.  The last check appears to be the definition of
> tx-babbling-errors.  Is there a transmit path that I have missed?

What specific processor & rev are you running on?

- k

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-19 16:29         ` Kumar Gala
@ 2009-02-19 16:38           ` Scott Coulter
  0 siblings, 0 replies; 16+ messages in thread
From: Scott Coulter @ 2009-02-19 16:38 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev


>
> What specific processor & rev are you running on?
>

I've only been running the modified kernel with the added BUG_ON() code
on the 8568E processor, but I've seen the errors reported on the 8572E
as well.

According to the u-boot startup:

8568E, Version: 1.1, (0x807d0011)
Core:  E500, Version: 2.2 (0x80210022)

8572E, Version: 1.1, (0x80e80011)
Core:  E500, Version: 3.0 (0x80210030)


Scott

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Gianfar tx-babbling-errors
  2009-02-19 16:17       ` Scott Coulter
  2009-02-19 16:29         ` Kumar Gala
@ 2009-02-19 16:48         ` sjoyeau
  2009-02-19 17:03           ` Kumar Gala
  1 sibling, 1 reply; 16+ messages in thread
From: sjoyeau @ 2009-02-19 16:48 UTC (permalink / raw)
  To: Scott Coulter; +Cc: linuxppc-dev


Hi Scott,

Your issue may come from data setup (or corruption) instead of code path:
a babbling error may occur when a TSEC TX descriptor doesn't have its
"last frame" bit set or when the data length is greater than the max
frame length.

--
sj
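
The two data-side causes named above can be checked by walking the
descriptors that make up one frame. The following is a minimal userspace
sketch under stated assumptions: the descriptor layout, the flag value,
and all names are hypothetical and are not taken from the gianfar driver
or the TSEC documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BD_FLAG_LAST 0x1u  /* hypothetical "last buffer of frame" flag */

/* Hypothetical descriptor: one buffer of a possibly multi-buffer frame. */
struct tx_bd {
    uint16_t flags;
    uint16_t length;
};

enum frame_check { FRAME_OK, FRAME_NO_LAST, FRAME_TOO_LONG };

/* Walk the descriptors of one frame, starting at bds[0]. */
enum frame_check check_tx_frame(const struct tx_bd *bds, size_t count,
                                uint32_t max_frame_len)
{
    uint32_t total = 0;

    assert(bds != NULL);
    for (size_t i = 0; i < count; i++) {
        total += bds[i].length;
        if (total > max_frame_len)
            return FRAME_TOO_LONG;   /* accumulated length exceeds the limit */
        if (bds[i].flags & BD_FLAG_LAST)
            return FRAME_OK;         /* frame is properly terminated */
    }
    /* Ran out of descriptors without seeing the last-buffer flag. */
    return FRAME_NO_LAST;
}
```

Either failure result corresponds to one of the two conditions described
in this message: a missing "last frame" bit, or a total data length above
the maximum frame length.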




-- 
------------------
Sylvain JOYEAU
Freelance Engineer
Software RT-OS R&D
sylvain.joyeau@gmail.com
Tél: +33-(0)667 477 052
"A good idea is one side of the coin. The other side is the practical
usefulness". J. Liedke.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Gianfar tx-babbling-errors
  2009-02-19 16:48         ` sjoyeau
@ 2009-02-19 17:03           ` Kumar Gala
  2009-02-19 17:29             ` Scott Coulter
  0 siblings, 1 reply; 16+ messages in thread
From: Kumar Gala @ 2009-02-19 17:03 UTC (permalink / raw)
  To: sjoyeau; +Cc: linuxppc-dev, Scott Coulter


On Feb 19, 2009, at 10:48 AM, sjoyeau@wanadoo.fr wrote:

> Hi Scott,
>
> Your issue may come from data setup (or corruption) instead of code
> path: a babbling error may occur when a TSEC TX descriptor doesn't
> have its "last frame" bit set or when the data length is greater than
> the max frame length.
>
> --

Take a look at TxBD[TR] and see if it's ever getting set.

- k

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-19 17:03           ` Kumar Gala
@ 2009-02-19 17:29             ` Scott Coulter
  0 siblings, 0 replies; 16+ messages in thread
From: Scott Coulter @ 2009-02-19 17:29 UTC (permalink / raw)
  To: Kumar Gala, sjoyeau; +Cc: linuxppc-dev



> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: February 19, 2009 12:04PM
> >
> > Your issue may come from data setup (or corruption) instead of code
> > path: a babbling error may occur when a TSEC TX descriptor doesn't
> > have its "last frame" bit set or when the data length is greater than
> > the max frame length.
> >
> > --
>
> Take a look at TxBD[TR] and see if it's ever getting set.
>

I added two bug checks:

- one in gfar_clean_tx_ring() after the check for completed frames:

		/* see if any transmits were truncated */
		BUG_ON(lstatus & BD_LFLAG(TXBD_TR));

- one in gfar_start_xmit() at the end to check a flag to see if
TXBD_LAST never gets set.

Within a couple of minutes it bug checked in gfar_clean_tx_ring():

kernel BUG at drivers/net/gianfar.c:1826!
Oops: Exception in kernel mode, sig: 5 [#1]
CYC833-8568
Modules linked in:
NIP: c019a7b8 LR: c019a824 CTR: 00000000
REGS: c0307d50 TRAP: 0700   Not tainted  (2.6.27.6)
MSR: 00021000 <ME>  CR: 44044044  XER: 20000000
TASK = c02eb4a8[0] 'swapper' THREAD: c0306000
GPR00: 00010000 c0307e00 c02eb4a8 00000002 00000010 00000001 c030dc48 c0e003c0
GPR08: c0314e80 180107a8 00000000 efab6dd0 24044022 002ae6bc 0000007f 00000010
GPR16: ef84a800 00029000 ef84abec ef84abc0 00000000 00000001 00000073 000001cc
GPR24: 00100100 00000001 ef84abc0 00000400 efab7400 00000000 ef0f9f00 efab71c8
NIP [c019a7b8] gfar_poll+0xb0/0x408
LR [c019a824] gfar_poll+0x11c/0x408
Call Trace:
[c0307e00] [c019a824] gfar_poll+0x11c/0x408 (unreliable)
[c0307e50] [c01cb144] net_rx_action+0xc4/0x180
[c0307e80] [c0036404] __do_softirq+0x74/0xe0
[c0307ea0] [c0004a20] do_softirq+0x54/0x58
[c0307eb0] [c00362b0] irq_exit+0x94/0x98
[c0307ec0] [c0004acc] do_IRQ+0xa8/0xc8
[c0307ed0] [c000e40c] ret_from_except+0x0/0x18
[c0307f90] [c0007c88] cpu_idle+0x50/0xd8
[c0307fb0] [c024b220] __got2_end+0x58/0x68
[c0307fc0] [c02c5808] start_kernel+0x230/0x2ac
[c0307ff0] [c00003c4] skpinv+0x2ec/0x328
Instruction dump:
3a800000 817e0094 a32b0004 57291838 7d3f4a14 7f89e040 7d7b4850 7d295f1e
81290000 2f890000 419c0160 552003de <0f000000> 813f0000 2f190000 55290084
Kernel panic - not syncing: Fatal exception in interrupt
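
The check that fired tests one flag in the combined status word of a
completed descriptor. Below is a minimal userspace sketch of that test. It
assumes the status flags sit in the upper 16 bits of lstatus with the data
length in the lower 16 bits, which is what the shift implied by BD_LFLAG()
suggests. The SKETCH_ macros and the flag value are placeholders, not
copied from gianfar.h, and assert() stands in for BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: status flags in the upper 16 bits, length in the lower. */
#define SKETCH_BD_LFLAG(flags) ((uint32_t)(flags) << 16)
#define SKETCH_TXBD_TR 0x0001u  /* placeholder value for the truncated flag */

/* True when the completed descriptor reports a truncated transmit. */
bool txbd_truncated(uint32_t lstatus)
{
    return (lstatus & SKETCH_BD_LFLAG(SKETCH_TXBD_TR)) != 0;
}

/* Data length field of the descriptor. */
uint16_t txbd_length(uint32_t lstatus)
{
    return (uint16_t)(lstatus & 0xffffu);
}

/* Userspace analogue of the BUG_ON() placed in the clean-up path. */
void check_completed_txbd(uint32_t lstatus)
{
    assert(!txbd_truncated(lstatus));
}
```

Reading the length field alongside the flag, rather than only halting,
shows how large the hardware believed the offending frame was.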




^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-18 16:16 Gianfar tx-babbling-errors Scott Coulter
  2009-02-18 17:03 ` Kumar Gala
@ 2009-02-20 19:16 ` Haruki Dai-R35557
  2009-02-20 20:59   ` Scott Coulter
  1 sibling, 1 reply; 16+ messages in thread
From: Haruki Dai-R35557 @ 2009-02-20 19:16 UTC (permalink / raw)
  To: Scott Coulter, linuxppc-dev; +Cc: Gala Kumar-B11780

Hi Scott,

 Is this your own board? If so, what PHY chip are you using? Are you
using the PHY driver?
 If the generic PHY driver is used and polls the MDIO bus periodically for
the link check, packets may get truncated. I hope this is not the case.

Regards
Dai


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-20 19:16 ` Haruki Dai-R35557
@ 2009-02-20 20:59   ` Scott Coulter
  2009-02-20 21:32     ` Haruki Dai-R35557
  0 siblings, 1 reply; 16+ messages in thread
From: Scott Coulter @ 2009-02-20 20:59 UTC (permalink / raw)
  To: Haruki Dai-R35557, linuxppc-dev; +Cc: Gala Kumar-B11780




Dai,

>  Is this your own board? If so, what PHY chip are you using? Are you
> using the PHY driver?
>  If the generic PHY driver is used and polling the MDIO periodically
> for the link check, you may truncate the packet. I hope this is not the
> case.

Yes, this is our own board.  The 8568E and 8572E processors each expose
two TSECs.  Each of the 4 TSECs is connected to a separate interface of a
Broadcom 5464 quad PHY in RGMII mode.  I am pretty sure that there is no
interrupt connected (I'll have to check with the hardware engineer) and I
know that I didn't configure one in the DTS file.  My kernel is configured
to use the Broadcom 54xx driver, but I'm not sure if it polls when no
interrupt is configured.

Thanks,
Scott

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-20 20:59   ` Scott Coulter
@ 2009-02-20 21:32     ` Haruki Dai-R35557
  2009-02-24 16:58       ` Scott Coulter
  2009-03-03 16:33       ` Scott Coulter
  0 siblings, 2 replies; 16+ messages in thread
From: Haruki Dai-R35557 @ 2009-02-20 21:32 UTC (permalink / raw)
  To: Scott Coulter, linuxppc-dev; +Cc: Gala Kumar-B11780

Scott,
 I am not sure about your PHY, but if the PHY is accessed over the MDIO
bus while a packet is being transmitted, the packet might be corrupted. Do
you have "phy_interrupt" in /proc/interrupts? What does your dmesg output
around the eTSEC look like (the PHY driver info appears around there)?

Regards
Dai




^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-20 21:32     ` Haruki Dai-R35557
@ 2009-02-24 16:58       ` Scott Coulter
  2009-03-03 16:33       ` Scott Coulter
  1 sibling, 0 replies; 16+ messages in thread
From: Scott Coulter @ 2009-02-24 16:58 UTC (permalink / raw)
  To: Haruki Dai-R35557, linuxppc-dev; +Cc: Gala Kumar-B11780


Dai,

>  I am not so sure about your PHY, but if you access to PHY while packet
> transmission through MDIO bus, the packet might be corrupted. Do you
> have "phy_interrupt" in the /proc/interrupts? What is your dmesg around
> the eTSEC look like (there is phy driver info surrounded).

With some printks, I did confirm that the kernel is polling the phy
every second or so.  So, I modified the routine which reads the phy
status to hardcode the link state, speed, and duplex without actually
performing any reads and the driver still bug checks with a truncated
packet.  I also dropped a printk in phy_read() and phy_write(), but
neither function is getting called with the hard-coded link information.

In gfar_clean_tx_ring() I printed out the tx descriptor data length
field when I detected that the descriptor had the TR bit set.  For 5 bug
checks in a row, the data length field was set to 1960 (which would
definitely cause the packet truncation).  I then went back into
gfar_start_xmit() and added some additional BUG_ON() checks of the data
length field before the descriptor is handed off to the TSEC and none of
them fire.  I am wondering if somehow the descriptor is getting
corrupted, perhaps by some code with an errant pointer.

Scott
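
One way to test the corruption theory is to keep a software-only copy of
each length written at submit time and compare it with the descriptor when
the transmit completes. The following is a minimal userspace sketch of that
idea; the ring layout, ring size, and every name in it are hypothetical and
do not come from the gianfar driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical ring size */

/* Hypothetical descriptor as seen by the hardware. */
struct tx_bd {
    uint16_t flags;
    uint16_t length;
};

struct tx_ring {
    struct tx_bd bd[RING_SIZE];      /* descriptors the hardware reads */
    uint16_t shadow_len[RING_SIZE];  /* software-only copy of each length */
};

/* Fill a descriptor and remember the length that was written. */
void ring_submit(struct tx_ring *ring, size_t idx, uint16_t len)
{
    assert(ring != NULL && idx < RING_SIZE);
    ring->bd[idx].length = len;
    ring->shadow_len[idx] = len;
}

/* At completion time, report whether the descriptor still matches. */
bool ring_length_intact(const struct tx_ring *ring, size_t idx)
{
    assert(ring != NULL && idx < RING_SIZE);
    return ring->bd[idx].length == ring->shadow_len[idx];
}
```

A mismatch at completion would mean the length changed after the
descriptor was handed off, which fits the pattern described here: checks
before hand-off pass, yet the completed descriptor carries a different
length.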




^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: Gianfar tx-babbling-errors
  2009-02-20 21:32     ` Haruki Dai-R35557
  2009-02-24 16:58       ` Scott Coulter
@ 2009-03-03 16:33       ` Scott Coulter
  1 sibling, 0 replies; 16+ messages in thread
From: Scott Coulter @ 2009-03-03 16:33 UTC (permalink / raw)
  To: Haruki Dai-R35557, linuxppc-dev; +Cc: Gala Kumar-B11780


Hi all,

In my continued search to stop tx-babbling-errors, I grabbed the latest
blobs for gianfar.c and gianfar.h from http://git.kernel.org.  The
history says that the following changes were made since the driver
version I was using (Freescale December 2008 LTIB release for 8572):

2009-02-09 Jarek Poplawski gianfar: Fix boot hangs while bringing up
  gianfar ethernet
2009-02-05 Andy Fleming gianfar: Fix potential soft reset race
2009-01-26 Anton Vorontsov gianfar: Revive VLAN support
2009-01-13 Anton Vorontsov gianfar: Fix soft lockup with multi-interrupt
  TSECs
2009-01-11 Clifford Wolf netdev: gianfar: add MII ioctl handler
2009-01-08 Kumar Gala gianfar: Fixup use of BUS_ID_SIZE
2009-01-06 Li Yang gianfar: ensure ECNTRL[R100] is cleared on link state
  change
2008-12-18 Andy Fleming gianfar: Continue polling until both tx and rx
  are empty

I backed out the following change due to compile errors:

2008-12-23 Neil Horman net: Remove unused netdev arg from some NAPI
interfaces

I also added the following change which wasn't in the kernel.org git:

2009-02-26 Rini van Zetten: fix to prevent num_txbdfree from going
negative

I also left my BUG_ON() check in place in gfar_clean_tx_ring() to look for
truncated packets.  So far I've done 3 complete kernel builds over NFS
with no tx-babbling-errors reported by ethtool and no bug checks.

Scott





^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2009-03-03 16:34 UTC | newest]

Thread overview: 16+ messages
2009-02-18 16:16 Gianfar tx-babbling-errors Scott Coulter
2009-02-18 17:03 ` Kumar Gala
2009-02-18 17:22   ` Scott Coulter
2009-02-18 17:29     ` Kumar Gala
2009-02-18 20:45       ` Scott Coulter
2009-02-19 16:17       ` Scott Coulter
2009-02-19 16:29         ` Kumar Gala
2009-02-19 16:38           ` Scott Coulter
2009-02-19 16:48         ` sjoyeau
2009-02-19 17:03           ` Kumar Gala
2009-02-19 17:29             ` Scott Coulter
2009-02-20 19:16 ` Haruki Dai-R35557
2009-02-20 20:59   ` Scott Coulter
2009-02-20 21:32     ` Haruki Dai-R35557
2009-02-24 16:58       ` Scott Coulter
2009-03-03 16:33       ` Scott Coulter
