* ixgbe hangs when XDP_TX is enabled
@ 2018-08-20 19:31 Nikita V. Shirokov
  2018-08-21 15:58 ` Alexander Duyck
  0 siblings, 1 reply; 6+ messages in thread
From: Nikita V. Shirokov @ 2018-08-20 19:31 UTC (permalink / raw)
  To: netdev; +Cc: alexander.h.duyck, jeffrey.t.kirsher

we are getting errors like these:

[  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
                 Tx Queue             <46>
                 TDH, TDT             <0>, <2>
                 next_to_use          <2>
                 next_to_clean        <0>
               tx_buffer_info[next_to_clean]
                 time_stamp           <0>
                 jiffies              <1000197c0>
[  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
[  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
[  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
[  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
[  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
[  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX

while running an XDP prog on an ixgbe NIC.
Right now I'm seeing this on a bpf-next kernel
(latest commit from Wed Aug 15 15:04:25 2018 -0700;
9a76aba02a37718242d7cdc294f0a3901928aa57).

Looks like this is the same issue as the one reported by Brenden in
https://www.spinics.net/lists/netdev/msg439438.html


* Re: ixgbe hangs when XDP_TX is enabled
  2018-08-20 19:31 ixgbe hangs when XDP_TX is enabled Nikita V. Shirokov
@ 2018-08-21 15:58 ` Alexander Duyck
  2018-08-21 16:58   ` Nikita V. Shirokov
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2018-08-21 15:58 UTC (permalink / raw)
  To: tehnerd; +Cc: Netdev, Duyck, Alexander H, Jeff Kirsher

On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
>
> we are getting errors like these:
>
> [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
>                  Tx Queue             <46>
>                  TDH, TDT             <0>, <2>
>                  next_to_use          <2>
>                  next_to_clean        <0>
>                tx_buffer_info[next_to_clean]
>                  time_stamp           <0>
>                  jiffies              <1000197c0>
> [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
>
> while running an XDP prog on an ixgbe NIC.
> Right now I'm seeing this on a bpf-next kernel
> (latest commit from Wed Aug 15 15:04:25 2018 -0700;
> 9a76aba02a37718242d7cdc294f0a3901928aa57).
>
> Looks like this is the same issue as the one reported by Brenden in
> https://www.spinics.net/lists/netdev/msg439438.html
>
> --
> Nikita V. Shirokov

Could you provide some additional information about your setup?
Specifically useful would be "ethtool -i", "ethtool -l", and lspci
-vvv info for your device. The total number of CPUs on the system
would be useful to know as well. In addition, could you try reproducing
the issue with one of the sample XDP programs provided with the kernel,
such as xdp2, which I believe uses the XDP_TX action. We need to
try to create a similar setup in our own environment for reproduction
and debugging.
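
For reference, roughly (a sketch from the kernel source tree; exact
build steps may vary with your tree and config):

  make -C samples/bpf
  ./samples/bpf/xdp2 $(cat /sys/class/net/eth0/ifindex)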

Thanks.

- Alex


* Re: ixgbe hangs when XDP_TX is enabled
  2018-08-21 15:58 ` Alexander Duyck
@ 2018-08-21 16:58   ` Nikita V. Shirokov
  2018-08-21 18:13     ` Alexander Duyck
  0 siblings, 1 reply; 6+ messages in thread
From: Nikita V. Shirokov @ 2018-08-21 16:58 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, jeffrey.t.kirsher

On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
> >
> > we are getting errors like these:
> >
> > [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> >                  Tx Queue             <46>
> >                  TDH, TDT             <0>, <2>
> >                  next_to_use          <2>
> >                  next_to_clean        <0>
> >                tx_buffer_info[next_to_clean]
> >                  time_stamp           <0>
> >                  jiffies              <1000197c0>
> > [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> > [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> > [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> >
> > while running an XDP prog on an ixgbe NIC.
> > Right now I'm seeing this on a bpf-next kernel
> > (latest commit from Wed Aug 15 15:04:25 2018 -0700;
> > 9a76aba02a37718242d7cdc294f0a3901928aa57).
> >
> > Looks like this is the same issue as the one reported by Brenden in
> > https://www.spinics.net/lists/netdev/msg439438.html
> >
> > --
> > Nikita V. Shirokov
> 
> Could you provide some additional information about your setup?
> Specifically useful would be "ethtool -i", "ethtool -l", and lspci
> -vvv info for your device. The total number of CPUs on the system
> would be useful to know as well. In addition, could you try
> reproducing
sure:

ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX:             0
TX:             0
Other:          1
Combined:       63
Current hardware settings:
RX:             0
TX:             0
Other:          1
Combined:       48

# ethtool -i eth0
driver: ixgbe
version: 5.1.0-k
firmware-version: 0x800006f1
expansion-rom-version:
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes


# nproc
48

lspci:

03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
        Subsystem: Intel Corporation Device 000d
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 30
        NUMA node: 0
        Region 0: Memory at c7d00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: I/O ports at 6000 [size=32]
        Region 4: Memory at c7e80000 (64-bit, non-prefetchable) [size=16K]
        Expansion ROM at c7e00000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
                Vector table: BAR=4 offset=00000000
                PBA: BAR=4 offset=00002000
        Capabilities: [a0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
                LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 <8us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-ff-b6-b2-60
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 128, stride: 2, Device ID: 10ed
                Supported Page Size: 00000553, System Page Size: 00000001
                Region 0: Memory at 00000000c7c00000 (64-bit, prefetchable)
                Region 3: Memory at 00000000c7b00000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Kernel driver in use: ixgbe




The workaround for now is to do the same as Brenden did in his original
finding: make sure that combined + XDP queues < max_tx_queues
(e.g., with combined == 14, set via "ethtool -L eth0 combined 14", the
issue goes away).

> the issue with one of the sample XDP programs provided with the kernel,
> such as xdp2, which I believe uses the XDP_TX action. We need to
> try to create a similar setup in our own environment for
> reproduction and debugging.

Will try, but this could take a while, because I'm not sure that we have
ixgbe in our test lab (and it would be hard to run such a test in prod).

> 
> Thanks.
> 
> - Alex


* Re: ixgbe hangs when XDP_TX is enabled
  2018-08-21 16:58   ` Nikita V. Shirokov
@ 2018-08-21 18:13     ` Alexander Duyck
  2018-08-22 16:22       ` Jeff Kirsher
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2018-08-21 18:13 UTC (permalink / raw)
  To: tehnerd; +Cc: Netdev, Jeff Kirsher

On Tue, Aug 21, 2018 at 9:59 AM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
>
> On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> > On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
> > >
> > > we are getting errors like these:
> > >
> > > [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> > >                  Tx Queue             <46>
> > >                  TDH, TDT             <0>, <2>
> > >                  next_to_use          <2>
> > >                  next_to_clean        <0>
> > >                tx_buffer_info[next_to_clean]
> > >                  time_stamp           <0>
> > >                  jiffies              <1000197c0>
> > > [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> > > [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > > [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > > [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> > > [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > > [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> > >
> > > while running an XDP prog on an ixgbe NIC.
> > > Right now I'm seeing this on a bpf-next kernel
> > > (latest commit from Wed Aug 15 15:04:25 2018 -0700;
> > > 9a76aba02a37718242d7cdc294f0a3901928aa57).
> > >
> > > Looks like this is the same issue as the one reported by Brenden in
> > > https://www.spinics.net/lists/netdev/msg439438.html
> > >
> > > --
> > > Nikita V. Shirokov
> >
> > Could you provide some additional information about your setup?
> > Specifically useful would be "ethtool -i", "ethtool -l", and lspci
> > -vvv info for your device. The total number of CPUs on the system
> > would be useful to know as well. In addition, could you try
> > reproducing
> sure:
>
> ethtool -l eth0
> Channel parameters for eth0:
> Pre-set maximums:
> RX:             0
> TX:             0
> Other:          1
> Combined:       63
> Current hardware settings:
> RX:             0
> TX:             0
> Other:          1
> Combined:       48
>
> # ethtool -i eth0
> driver: ixgbe
> version: 5.1.0-k
> firmware-version: 0x800006f1
> expansion-rom-version:
> bus-info: 0000:03:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
>
> # nproc
> 48
>
> lspci:
>
> 03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>         Subsystem: Intel Corporation Device 000d
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 30
>         NUMA node: 0
>         Region 0: Memory at c7d00000 (64-bit, non-prefetchable) [size=1M]
>         Region 2: I/O ports at 6000 [size=32]
>         Region 4: Memory at c7e80000 (64-bit, non-prefetchable) [size=16K]
>         Expansion ROM at c7e00000 [disabled] [size=512K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>                 Address: 0000000000000000  Data: 0000
>                 Masking: 00000000  Pending: 00000000
>         Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>                 Vector table: BAR=4 offset=00000000
>                 PBA: BAR=4 offset=00002000
>         Capabilities: [a0] Express (v2) Endpoint, MSI 00
>                 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
>                 LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 <8us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>         Capabilities: [100 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-ff-b6-b2-60
>         Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
>                 ARICap: MFVC- ACS-, Next Function: 0
>                 ARICtl: MFVC- ACS-, Function Group: 0
>         Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
>                 IOVCap: Migration-, Interrupt Message Number: 000
>                 IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
>                 IOVSta: Migration-
>                 Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
>                 VF offset: 128, stride: 2, Device ID: 10ed
>                 Supported Page Size: 00000553, System Page Size: 00000001
>                 Region 0: Memory at 00000000c7c00000 (64-bit, prefetchable)
>                 Region 3: Memory at 00000000c7b00000 (64-bit, prefetchable)
>                 VF Migration: offset: 00000000, BIR: 0
>         Kernel driver in use: ixgbe
>
>
>
>
> The workaround for now is to do the same as Brenden did in his original
> finding: make sure that combined + XDP queues < max_tx_queues
> (e.g., with combined == 14 the issue goes away).
>
> > the issue with one of the sample XDP programs provided with the kernel,
> > such as xdp2, which I believe uses the XDP_TX action. We need to
> > try to create a similar setup in our own environment for
> > reproduction and debugging.
>
> Will try, but this could take a while, because I'm not sure that we have
> ixgbe in our test lab (and it would be hard to run such a test in prod)
>
> >
> > Thanks.
> >
> > - Alex
>
> --
> Nikita V. Shirokov

So I have been reading the datasheet
(https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82599-10-gbe-controller-datasheet.pdf)
and it looks like the assumption that Brenden came to in the earlier
referenced link is probably correct. From what I can tell there is a
limit of 64 queues in the base RSS mode of the device, so while it
supports more than 64 queues you can only make use of 64 as per table
7-25.
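
To spell out the arithmetic for the setup reported here: 48 combined
queues plus one XDP TX queue per CPU means 48 + 48 = 96 Tx queues
requested, well past the 64 usable in this mode. Presumably the XDP
rings that land above queue index 63 never actually get enabled, which
would also line up with the TXDCTL.ENABLE message in the log.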

For now I think the workaround you are using is probably the only
viable solution. I myself don't have time to work on resolving this,
but I am sure one of the maintainers for ixgbe will be responding
shortly.

One possible solution we may want to look at would be to make use of
the 32 pool/VF mode in the MTQC register. That should enable us to
make use of all 128 queues but I am sure there would be other side
effects such as having to set the bits in the PFVFTE register in order
to enable the extra Tx queues.
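
To sketch what I mean (hypothetical and untested, using the driver's
register macros; a real change would also have to rework the queue
layout, the RSS redirection setup, and so on):

	/* Hypothetical sketch only, not a tested change: switch the Tx
	 * path to VT mode with 32 pools so all 128 Tx queues become
	 * addressable, then set the per-pool Tx enable bits (PFVFTE in
	 * the datasheet, IXGBE_VFTE in the driver).
	 */
	IXGBE_WRITE_REG(hw, IXGBE_MTQC, IXGBE_MTQC_VT_ENA | IXGBE_MTQC_32VF);
	IXGBE_WRITE_REG(hw, IXGBE_VFTE(0), 0xFFFFFFFF); /* Tx enable, pools 0-31 */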

Thanks.

- Alex


* Re: ixgbe hangs when XDP_TX is enabled
  2018-08-21 18:13     ` Alexander Duyck
@ 2018-08-22 16:22       ` Jeff Kirsher
  2018-08-24 14:25         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Kirsher @ 2018-08-22 16:22 UTC (permalink / raw)
  To: Alexander Duyck, tehnerd; +Cc: Netdev, tytus.a.wasilewski, Tymoteusz Kielan


On Tue, 2018-08-21 at 11:13 -0700, Alexander Duyck wrote:
> On Tue, Aug 21, 2018 at 9:59 AM Nikita V. Shirokov <
> tehnerd@tehnerd.com> wrote:
> > 
> > On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> > > On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <
> > > tehnerd@tehnerd.com> wrote:
> > > > 
> > > > we are getting errors like these:
> > > > 
> > > > [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang
> > > > (XDP)
> > > >                   Tx Queue             <46>
> > > >                   TDH, TDT             <0>, <2>
> > > >                   next_to_use          <2>
> > > >                   next_to_clean        <0>
> > > >                 tx_buffer_info[next_to_clean]
> > > >                   time_stamp           <0>
> > > >                   jiffies              <1000197c0>
> > > > [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on
> > > > queue 46, resetting adapter
> > > > [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to
> > > > tx timeout
> > > > [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > > > [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one
> > > > or more queues not cleared within the polling period
> > > > [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > > > [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps,
> > > > Flow Control: RX/TX
> > > > 
> > > > while running an XDP prog on an ixgbe NIC.
> > > > Right now I'm seeing this on a bpf-next kernel
> > > > (latest commit from Wed Aug 15 15:04:25 2018 -0700;
> > > > 9a76aba02a37718242d7cdc294f0a3901928aa57).
> > > >
> > > > Looks like this is the same issue as the one reported by Brenden in
> > > > https://www.spinics.net/lists/netdev/msg439438.html
> > > > 
> > > > --
> > > > Nikita V. Shirokov
> > > 
> > > Could you provide some additional information about your setup?
> > > Specifically useful would be "ethtool -i", "ethtool -l", and
> > > lspci
> > > -vvv info for your device. The total number of CPUs on the system
> > > would be useful to know as well. In addition, could you try
> > > reproducing
> > 
> > sure:
> > 
> > ethtool -l eth0
> > Channel parameters for eth0:
> > Pre-set maximums:
> > RX:             0
> > TX:             0
> > Other:          1
> > Combined:       63
> > Current hardware settings:
> > RX:             0
> > TX:             0
> > Other:          1
> > Combined:       48
> > 
> > # ethtool -i eth0
> > driver: ixgbe
> > version: 5.1.0-k
> > firmware-version: 0x800006f1
> > expansion-rom-version:
> > bus-info: 0000:03:00.0
> > supports-statistics: yes
> > supports-test: yes
> > supports-eeprom-access: yes
> > supports-register-dump: yes
> > supports-priv-flags: yes
> > 
> > 
> > # nproc
> > 48
> > 
> > lspci:
> > 
> > 03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> > SFI/SFP+ Network Connection (rev 01)
> >          Subsystem: Intel Corporation Device 000d
> >          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV-
> > VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
> >          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast
> > >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >          Latency: 0, Cache Line Size: 32 bytes
> >          Interrupt: pin A routed to IRQ 30
> >          NUMA node: 0
> >          Region 0: Memory at c7d00000 (64-bit, non-prefetchable)
> > [size=1M]
> >          Region 2: I/O ports at 6000 [size=32]
> >          Region 4: Memory at c7e80000 (64-bit, non-prefetchable)
> > [size=16K]
> >          Expansion ROM at c7e00000 [disabled] [size=512K]
> >          Capabilities: [40] Power Management version 3
> >                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> > PME(D0+,D1-,D2-,D3hot+,D3cold+)
> >                  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1
> > PME-
> >          Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
> >                  Address: 0000000000000000  Data: 0000
> >                  Masking: 00000000  Pending: 00000000
> >          Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
> >                  Vector table: BAR=4 offset=00000000
> >                  PBA: BAR=4 offset=00002000
> >          Capabilities: [a0] Express (v2) Endpoint, MSI 00
> >                  DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency
> > L0s <512ns, L1 <64us
> >                          ExtTag- AttnBtn- AttnInd- PwrInd- RBE+
> > FLReset+ SlotPowerLimit 0.000W
> >                  DevCtl: Report errors: Correctable+ Non-Fatal+
> > Fatal+ Unsupported+
> >                          RlxdOrd- ExtTag- PhantFunc- AuxPwr-
> > NoSnoop+ FLReset-
> >                          MaxPayload 256 bytes, MaxReadReq 512 bytes
> >                  DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+
> > AuxPwr+ TransPend+
> >                  LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s,
> > Exit Latency L0s unlimited, L1 <8us
> >                          ClockPM- Surprise- LLActRep- BwNot-
> > ASPMOptComp-
> >                  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled-
> > CommClk+
> >                          ExtSynch- ClockPM- AutWidDis- BWInt-
> > AutBWInt-
> >                  LnkSta: Speed 5GT/s, Width x8, TrErr- Train-
> > SlotClk+ DLActive- BWMgmt- ABWMgmt-
> >                  DevCap2: Completion Timeout: Range ABCD,
> > TimeoutDis+, LTR-, OBFF Not Supported
> >                  DevCtl2: Completion Timeout: 50us to 50ms,
> > TimeoutDis-, LTR-, OBFF Disabled
> >                  LnkCtl2: Target Link Speed: 5GT/s,
> > EnterCompliance- SpeedDis-
> >                           Transmit Margin: Normal Operating Range,
> > EnterModifiedCompliance- ComplianceSOS-
> >                           Compliance De-emphasis: -6dB
> >                  LnkSta2: Current De-emphasis Level: -6dB,
> > EqualizationComplete-, EqualizationPhase1-
> >                           EqualizationPhase2-, EqualizationPhase3-, 
> > LinkEqualizationRequest-
> >          Capabilities: [100 v1] Advanced Error Reporting
> >                  UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> > UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >                  UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> > UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >                  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
> > UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> >                  CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> > NonFatalErr+
> >                  CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> > NonFatalErr+
> >                  AERCap: First Error Pointer: 00, GenCap+ CGenEn-
> > ChkCap+ ChkEn-
> >          Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-
> > ff-b6-b2-60
> >          Capabilities: [150 v1] Alternative Routing-ID
> > Interpretation (ARI)
> >                  ARICap: MFVC- ACS-, Next Function: 0
> >                  ARICtl: MFVC- ACS-, Function Group: 0
> >          Capabilities: [160 v1] Single Root I/O Virtualization (SR-
> > IOV)
> >                  IOVCap: Migration-, Interrupt Message Number: 000
> >                  IOVCtl: Enable- Migration- Interrupt- MSE-
> > ARIHierarchy+
> >                  IOVSta: Migration-
> >                  Initial VFs: 64, Total VFs: 64, Number of VFs: 0,
> > Function Dependency Link: 00
> >                  VF offset: 128, stride: 2, Device ID: 10ed
> >                  Supported Page Size: 00000553, System Page Size:
> > 00000001
> >                  Region 0: Memory at 00000000c7c00000 (64-bit,
> > prefetchable)
> >                  Region 3: Memory at 00000000c7b00000 (64-bit,
> > prefetchable)
> >                  VF Migration: offset: 00000000, BIR: 0
> >          Kernel driver in use: ixgbe
> > 
> > 
> > 
> > 
> > The workaround for now is to do the same as Brenden did in his
> > original
> > finding: make sure that combined + XDP queues < max_tx_queues
> > (e.g., with combined == 14 the issue goes away).
> > 
> > > the issue with one of the sample XDP programs provided with the
> > > kernel,
> > > such as xdp2, which I believe uses the XDP_TX action. We
> > > need to
> > > try to create a similar setup in our own environment for
> > > reproduction and debugging.
> > 
> > Will try, but this could take a while, because I'm not sure that we
> > have
> > ixgbe in our test lab (and it would be hard to run such a test in
> > prod)
> > 
> > > 
> > > Thanks.
> > > 
> > > - Alex
> > 
> > --
> > Nikita V. Shirokov
> 
> So I have been reading the datasheet
> (
> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82599-10-gbe-controller-datasheet.pdf
> )
> and it looks like the assumption that Brenden came to in the earlier
> referenced link is probably correct. From what I can tell there is a
> limit of 64 queues in the base RSS mode of the device, so while it
> supports more than 64 queues you can only make use of 64 as per table
> 7-25.
> 
> For now I think the workaround you are using is probably the only
> viable solution. I myself don't have time to work on resolving this,
> but I am sure one of the maintainers for ixgbe will be responding
> shortly.

I have notified the 10GbE maintainers, and we are working to reproduce
the issue currently.

> 
> One possible solution we may want to look at would be to make use of
> the 32 pool/VF mode in the MTQC register. That should enable us to
> make use of all 128 queues but I am sure there would be other side
> effects such as having to set the bits in the PFVFTE register in
> order
> to enable the extra Tx queues.




* Re: ixgbe hangs when XDP_TX is enabled
  2018-08-22 16:22       ` Jeff Kirsher
@ 2018-08-24 14:25         ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 6+ messages in thread
From: Jesper Dangaard Brouer @ 2018-08-24 14:25 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: brouer, Alexander Duyck, tehnerd, Netdev, tytus.a.wasilewski,
	Tymoteusz Kielan, John Fastabend, Daniel Borkmann,
	Alexei Starovoitov


On Wed, 22 Aug 2018 09:22:58 -0700 Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote:
> On Tue, 2018-08-21 at 11:13 -0700, Alexander Duyck wrote:
> > On Tue, Aug 21, 2018 at 9:59 AM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
> > > 
> > > On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:  
> > > > On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
> > > > > 
> > > > > we are getting errors like these:
> > > > > 
> > > > > [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> > > > >                   Tx Queue             <46>
> > > > >                   TDH, TDT             <0>, <2>
> > > > >                   next_to_use          <2>
> > > > >                   next_to_clean        <0>
> > > > >                 tx_buffer_info[next_to_clean]
> > > > >                   time_stamp           <0>
> > > > >                   jiffies              <1000197c0>
> > > > > [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> > > > > [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > > > > [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > > > > [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> > > > > [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > > > > [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> > > > > 
> > > > > while running an XDP prog on an ixgbe NIC.
> > > > > Right now I'm seeing this on a bpf-next kernel
> > > > > (latest commit from Wed Aug 15 15:04:25 2018 -0700;
> > > > > 9a76aba02a37718242d7cdc294f0a3901928aa57).
> > > > >
> > > > > Looks like this is the same issue as the one reported by Brenden in
> > > > > https://www.spinics.net/lists/netdev/msg439438.html
> > > > > 
> > > > [...] The total number of CPUs on the system
> > > > would be useful to know as well.
[...]
> > > # nproc
> > > 48
> > > 
[...]
> > > ethtool -l eth0
> > > Channel parameters for eth0:
> > > Pre-set maximums:
> > > RX:             0
> > > TX:             0
> > > Other:          1
> > > Combined:       63
> > > Current hardware settings:
> > > RX:             0
> > > TX:             0
> > > Other:          1
> > > Combined:       48
[...]

> > > 
> > > The workaround for now is to do the same as Brenden did in his
> > > original
> > > finding: make sure that combined + XDP queues < max_tx_queues
> > > (e.g., with combined == 14 the issue goes away).
> > >   
> > > > the issue with one of the sample XDP programs provided with the
> > > > kernel, such as xdp2, which I believe uses the XDP_TX
> > > > action. We need to try to create a similar setup in our own
> > > > environment for reproduction and debugging.
> > > 
> > > Will try, but this could take a while, because I'm not sure that we
> > > have ixgbe in our test lab (and it would be hard to run such a test
> > > in prod)

Notice that to reproduce this you need a system with 48 cores. (I
predict: with fewer than 33 cores it will not show, since 32 combined +
32 XDP TX-queues still fit within the 64-queue limit, and with more than
48 cores the XDP prog should be rejected at load time.)


> > 
> > So I have been reading the datasheet
> > (
> > https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82599-10-gbe-controller-datasheet.pdf
> > )
> > and it looks like the assumption that Brenden came to in the earlier
> > referenced link is probably correct. From what I can tell there is a
> > limit of 64 queues in the base RSS mode of the device, so while it
> > supports more than 64 queues you can only make use of 64 as per table
> > 7-25.
> > 

As far as I can remember, the driver code assumes up to 96 queues are
available.  It sounds like the driver XDP code that allocates 'one XDP
TX-queue per core' in the system is causing this.

I have previously complained that the ixgbe driver will not be able to
enable XDP on machines with many CPU cores, due to the 'one XDP TX-queue
per core' design requirement.
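
For illustration, the kind of constraint involved (a from-memory sketch
of the driver's XDP setup path, not a verbatim quote of the ixgbe code):

	/* Sketch: XDP setup refuses to load a program outright when there
	 * are more possible CPUs than dedicated XDP TX queues.  Below that
	 * hard limit, regular queues + one-XDP-TX-queue-per-core can still
	 * exceed what the current HW mode exposes, which is the case
	 * reported here (48 + 48 = 96).
	 */
	if (nr_cpu_ids > MAX_XDP_QUEUES)
		return -ENOMEM;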


> > For now I think the workaround you are using is probably the only
> > viable solution. I myself don't have time to work on resolving this,
> > but I am sure one of the maintainers for ixgbe will be responding
> > shortly.
> 
> I have notified the 10GbE maintainers, and we are working to reproduce
> the issue currently.

For reproducers, notice the correlation with the number of cores the
system has.

 
> > One possible solution we may want to look at would be to make use of
> > the 32 pool/VF mode in the MTQC register. That should enable us to
> > make use of all 128 queues but I am sure there would be other side
> > effects such as having to set the bits in the PFVFTE register in
> > order to enable the extra Tx queues.  
 
Getting access to more queues is of course good, as it moves the bar for
how many cores a system can have before XDP will no longer work with
the ixgbe driver.

An alternative solution is also possible, but there will be a
performance trade-off.  After merge commit 10f678683e4 ("Merge branch
'xdp_xmit-bulking'") the ndo_xdp_xmit() call gets bulks of up to 16
frames (the limit within devmap).  Thus, systems that cannot allocate a
NIC HW queue for each CPU core could alternatively use locked XDP TX
queue(s), where the locking cost will be amortized due to bulking.  This
mode will be slower, thus the question is how we "warn" the user that
this will be operating in a slightly less optimal XDP-TX mode (would a
simple pr_info be enough, like when there is insufficient PCIe BW?).
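
To make the shared-queue idea concrete, a minimal sketch (the queue
struct and helpers below are made up for illustration; only the
ndo_xdp_xmit() signature matches the post-bulking kernel API):

	static int sketch_xdp_xmit(struct net_device *dev, int n,
				   struct xdp_frame **frames, u32 flags)
	{
		/* Made-up shared TX queue with its own lock; a real driver
		 * would pick one of a small pool of queues, e.g. by CPU.
		 */
		struct sketch_txq *txq = sketch_pick_shared_txq(dev);
		int i, sent = 0;

		/* One lock round-trip per bulk of up to 16 frames, not per
		 * frame: this is what amortizes the locking cost versus a
		 * dedicated per-core queue.
		 */
		spin_lock(&txq->lock);
		for (i = 0; i < n; i++) {
			if (sketch_tx_one_frame(txq, frames[i])) /* made up */
				break;
			sent++;
		}
		spin_unlock(&txq->lock);

		return sent;
	}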

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

