* Latitude 5495's tg3 hangs under heavy load
@ 2018-12-07  9:27 Kai Heng Feng
  2019-01-07  6:04 ` Kai Heng Feng
  0 siblings, 1 reply; 8+ messages in thread
From: Kai Heng Feng @ 2018-12-07  9:27 UTC (permalink / raw)
  To: Siva Reddy Kallam, Prashant Sreedharan, Michael Chan
  Cc: Linux Netdev List, Chih-Hsyuan Ho

Hi tg3 maintainers,

I’ve encountered a network freeze when using tg3 on a gigabit network.

The issue is easy to reproduce by using scp to transfer files over the local network.
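
To pin down when the freeze happens during such a transfer, one option is to sample the interface's TX packet counter and look for the point where it stops advancing. The following is a minimal sketch, not part of the original report: the helper names, the one-second interval, and the three-sample threshold are all assumptions, and a flat counter can also simply mean the transfer has finished.

```python
from pathlib import Path
import time


def find_stall(samples, min_idle_samples=3):
    """Return the index where a counter stops advancing, or None.

    samples is a list of cumulative counter readings taken at a fixed
    interval. A stall is reported once the value has stayed unchanged
    for min_idle_samples consecutive readings; the returned index is
    the first reading that repeated the previous value.
    """
    idle = 0
    for i in range(1, len(samples)):
        if samples[i] == samples[i - 1]:
            idle += 1
            if idle >= min_idle_samples:
                return i - min_idle_samples + 1
        else:
            idle = 0
    return None


def read_tx_packets(iface):
    """Read the cumulative TX packet counter that the kernel exposes in sysfs."""
    path = Path(f"/sys/class/net/{iface}/statistics/tx_packets")
    return int(path.read_text())


def watch(iface, interval=1.0, count=60):
    """Sample the TX counter count times, then look for a stall.

    iface is whatever name the tg3 device has on the test machine
    (hypothetical here; the report does not give it).
    """
    samples = []
    for _ in range(count):
        samples.append(read_tx_packets(iface))
        time.sleep(interval)
    return find_stall(samples)
```

Running `watch()` in a second terminal while the scp transfer is in progress gives a rough offset, in samples, at which traffic stopped, which can then be matched against timestamps in dmesg.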

The symptom is pretty similar to what this commit is trying to solve:
commit 3a498606bb04af603a46ebde8296040b2de350d1
Author: Sanjeev Bansal <sanjeevb.bansal@broadcom.com>
Date:   Mon Jul 16 11:13:32 2018 +0530

    tg3: Add higher cpu clock for 5762.
    
    This patch has fix for TX timeout while running bi-directional
    traffic with 100 Mbps using 5762.

But reverting this commit doesn’t help.

Latitude 5495 is an AMD Raven Ridge platform; I’m not sure whether this matters.

Here’s the lspci output for this device:
03:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme BCM5762 Gigabit Ethernet PCIe [14e4:1687] (rev 10)
        Subsystem: Dell NetXtreme BCM5762 Gigabit Ethernet PCIe [1028:0814]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at e0220000 (64-bit, prefetchable) [size=64K]
        Region 2: Memory at e0210000 (64-bit, prefetchable) [size=64K]
        Region 4: Memory at e0200000 (64-bit, prefetchable) [size=64K]
        Capabilities: [48] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
                Product Name: Broadcom NetXtreme Gigabit Ethernet Controller
                Read-only fields:
                        [PN] Part number: BCM95762
                        [EC] Engineering changes: 106679-15
                        [SN] Serial number: 0123456789
                        [MN] Manufacture ID: 31 34 65 34
                        [RV] Reserved: checksum good, 28 byte(s) reserved
                Read/write fields:
                        [YA] Asset tag: XYZ01234567
                        [RW] Read-write area: 107 byte(s) free
                End
        Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [a0] MSI-X: Enable+ Count=6 Masked-
                Vector table: BAR=4 offset=00000000
                PBA: BAR=2 offset=00000120
        Capabilities: [ac] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr+ NoSnoop- FLReset-
                        MaxPayload 128 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via WAKE#
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [13c v1] Device Serial Number 00-00-a4-4c-c8-5b-65-74
        Capabilities: [150 v1] Power Budgeting <?>
        Capabilities: [160 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                        Status: NegoPending- InProgress-
        Capabilities: [1b0 v1] Latency Tolerance Reporting
                Max snoop latency: 1048576ns
                Max no snoop latency: 1048576ns
        Capabilities: [230 v1] Transaction Processing Hints
                Interrupt vector mode supported
                Steering table in MSI-X table
        Kernel driver in use: tg3
        Kernel modules: tg3

Please let me know if you need more information.

Kai-Heng

* Re: Latitude 5495's tg3 hangs under heavy load
  2018-12-07  9:27 Latitude 5495's tg3 hangs under heavy load Kai Heng Feng
@ 2019-01-07  6:04 ` Kai Heng Feng
  2019-01-07  9:12   ` Siva Reddy Kallam
  0 siblings, 1 reply; 8+ messages in thread
From: Kai Heng Feng @ 2019-01-07  6:04 UTC (permalink / raw)
  To: Siva Reddy Kallam, Prashant Sreedharan, Michael Chan
  Cc: Linux Netdev List, Chih-Hsyuan Ho

Hi tg3 folks,

Any idea how to solve the bug?

Kai-Heng

* Re: Latitude 5495's tg3 hangs under heavy load
  2019-01-07  6:04 ` Kai Heng Feng
@ 2019-01-07  9:12   ` Siva Reddy Kallam
  2019-01-09  5:42     ` Kai Heng Feng
  0 siblings, 1 reply; 8+ messages in thread
From: Siva Reddy Kallam @ 2019-01-07  9:12 UTC (permalink / raw)
  To: Kai Heng Feng
  Cc: Prashant Sreedharan, Michael Chan, Linux Netdev List, Chih-Hsyuan Ho

On Mon, Jan 7, 2019 at 11:34 AM Kai Heng Feng
<kai.heng.feng@canonical.com> wrote:
>
> Hi tg3 folks,
>
> Any idea how to solve the bug?
>
> Kai-Heng
Hi,
Can you share the register dump (ethtool -d)?
Does ifconfig down/up bring the interface back?

* Re: Latitude 5495's tg3 hangs under heavy load
  2019-01-07  9:12   ` Siva Reddy Kallam
@ 2019-01-09  5:42     ` Kai Heng Feng
  2019-01-31  8:08       ` Kai-Heng Feng
  0 siblings, 1 reply; 8+ messages in thread
From: Kai Heng Feng @ 2019-01-09  5:42 UTC (permalink / raw)
  To: Siva Reddy Kallam
  Cc: Prashant Sreedharan, Michael Chan, Linux Netdev List, Chih-Hsyuan Ho



> On Jan 7, 2019, at 5:12 PM, Siva Reddy Kallam <siva.kallam@broadcom.com> wrote:
> 
> On Mon, Jan 7, 2019 at 11:34 AM Kai Heng Feng
> <kai.heng.feng@canonical.com> wrote:
>> 
>> Hi tg3 folks,
>> 
>> Any idea how to solve the bug?
>> 
>> Kai-Heng
> Hi,
> Can you share Register dump(ethtool -d)?

`ethtool -d` before freeze:
https://pastebin.com/MSkJzhcv

`ethtool -d` after freeze:
https://pastebin.com/dQj8mLsN

> Is ifconfig down/up bringing back interface?

Yes, and the network seems to work fine afterward.

`ethtool -d` after down/up:
https://pastebin.com/vL1gCC2n
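
For anyone comparing the dumps above, the registers that changed across the freeze can be listed with a short script. This is only a sketch under a stated assumption: it expects each useful line of the saved `ethtool -d` output to hold a hexadecimal offset followed by a hexadecimal value, and it skips every other line. The function names are illustrative and not part of ethtool.

```python
import re

# Assumption: a register line looks like "<hex offset> <hex value>",
# optionally with a colon after the offset. Header and separator lines
# do not match and are ignored.
LINE = re.compile(r"^\s*(0x[0-9a-fA-F]+)[:\s]+(0x[0-9a-fA-F]+)\s*$")


def parse_dump(text):
    """Map register offset -> value for every line that matches LINE."""
    regs = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            regs[int(m.group(1), 16)] = int(m.group(2), 16)
    return regs


def diff_dumps(before, after):
    """Return sorted (offset, old, new) tuples for registers that differ.

    A register that is missing from one dump is reported with None on
    that side.
    """
    changes = []
    for offset in sorted(before.keys() | after.keys()):
        old, new = before.get(offset), after.get(offset)
        if old != new:
            changes.append((offset, old, new))
    return changes


def format_changes(changes):
    """Render the differences as aligned text lines for pasting into mail."""
    def hexval(v):
        return "(absent)" if v is None else f"0x{v:08x}"
    return [f"0x{off:04x}: {hexval(old)} -> {hexval(new)}"
            for off, old, new in changes]
```

Saving each dump to a file and feeding the two texts through `parse_dump()` and `diff_dumps()` narrows the before/after comparison to the registers that moved, and the same can be done for the dump taken after the down/up cycle.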

Kai-Heng

* Re: Latitude 5495's tg3 hangs under heavy load
  2019-01-09  5:42     ` Kai Heng Feng
@ 2019-01-31  8:08       ` Kai-Heng Feng
  2019-03-11  3:53         ` Kai-Heng Feng
  0 siblings, 1 reply; 8+ messages in thread
From: Kai-Heng Feng @ 2019-01-31  8:08 UTC (permalink / raw)
  To: Siva Reddy Kallam
  Cc: Prashant Sreedharan, Michael Chan, Linux Netdev List, Chih-Hsyuan Ho

Hi tg3 folks, 

> On Jan 9, 2019, at 13:42, Kai Heng Feng <kai.heng.feng@canonical.com> wrote:
> 
> 
> 
>> On Jan 7, 2019, at 5:12 PM, Siva Reddy Kallam <siva.kallam@broadcom.com> wrote:
>> 
>> On Mon, Jan 7, 2019 at 11:34 AM Kai Heng Feng
>> <kai.heng.feng@canonical.com> wrote:
>>> 
>>> Hi tg3 folks,
>>> 
>>> Any idea how to solve the bug?
>>> 
>>> Kai-Heng
>> Hi,
>> Can you share Register dump(ethtool -d)?
> 
> `ethtool -d` before freeze:
> https://pastebin.com/MSkJzhcv
> 
> `ethtool -d` after freeze:
> https://pastebin.com/dQj8mLsN
> 
>> Is ifconfig down/up bringing back interface?
> 
> Yes. And seems like network works fine afterward.
> 
> `ethtool -d` after down/up:
> https://pastebin.com/vL1gCC2n

Is there any update?

Is this information sufficient?

Kai-Heng

>>>> 
>>>> Kai-Heng
>>> 
> 


* Re: Latitude 5495's tg3 hangs under heavy load
  2019-01-31  8:08       ` Kai-Heng Feng
@ 2019-03-11  3:53         ` Kai-Heng Feng
  2019-03-11 13:42           ` Siva Reddy Kallam
  0 siblings, 1 reply; 8+ messages in thread
From: Kai-Heng Feng @ 2019-03-11  3:53 UTC (permalink / raw)
  To: Siva Reddy Kallam
  Cc: Prashant Sreedharan, Michael Chan, Linux Netdev List, Chih-Hsyuan Ho

[snipped]

Hi again,

Any update?

Kai-Heng

* Re: Latitude 5495's tg3 hangs under heavy load
  2019-03-11  3:53         ` Kai-Heng Feng
@ 2019-03-11 13:42           ` Siva Reddy Kallam
  2019-05-20  5:38             ` Kai-Heng Feng
  0 siblings, 1 reply; 8+ messages in thread
From: Siva Reddy Kallam @ 2019-03-11 13:42 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Prashant Sreedharan, Michael Chan, Linux Netdev List, Chih-Hsyuan Ho

On Mon, Mar 11, 2019 at 9:23 AM Kai-Heng Feng
<kai.heng.feng@canonical.com> wrote:
>
> [snipped]
>
> Hi again,
>
> Any update?
>
> Kai-Heng
Sorry for the late reply. We will provide our feedback soon.

* Re: Latitude 5495's tg3 hangs under heavy load
  2019-03-11 13:42           ` Siva Reddy Kallam
@ 2019-05-20  5:38             ` Kai-Heng Feng
  0 siblings, 0 replies; 8+ messages in thread
From: Kai-Heng Feng @ 2019-05-20  5:38 UTC (permalink / raw)
  To: Siva Reddy Kallam
  Cc: Prashant Sreedharan, Michael Chan, Linux Netdev List, Chih-Hsyuan Ho

Hi Siva,

at 21:42, Siva Reddy Kallam <siva.kallam@broadcom.com> wrote:

> On Mon, Mar 11, 2019 at 9:23 AM Kai-Heng Feng
> <kai.heng.feng@canonical.com> wrote:
>> [snipped]
>>
>> Hi again,
>>
>> Any update?
>>
>> Kai-Heng
> Sorry for the late reply. We will provide our feedback soon.

Any good news? It still happens on the latest mainline kernel.

Kai-Heng

end of thread, other threads:[~2019-05-20  5:38 UTC | newest]

Thread overview: 8+ messages
2018-12-07  9:27 Latitude 5495's tg3 hangs under heavy load Kai Heng Feng
2019-01-07  6:04 ` Kai Heng Feng
2019-01-07  9:12   ` Siva Reddy Kallam
2019-01-09  5:42     ` Kai Heng Feng
2019-01-31  8:08       ` Kai-Heng Feng
2019-03-11  3:53         ` Kai-Heng Feng
2019-03-11 13:42           ` Siva Reddy Kallam
2019-05-20  5:38             ` Kai-Heng Feng
