* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
@ 2020-12-18 18:08 Ben Greear
  2020-12-18 19:27 ` Fujinaka, Todd
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2020-12-18 18:08 UTC
To: intel-wired-lan

Hello,

One of our users reports that our 5.10 kernel negotiates the 1/2.5/5/10g NIC to 1Gbps instead of 2.5Gbps. Booting the 5.4 kernel shows 2.5Gbps as expected.

I have not yet tried to bisect this...is it a known issue?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-18 18:08 [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation? Ben Greear
@ 2020-12-18 19:27 ` Fujinaka, Todd
  2020-12-18 19:41   ` Ben Greear
  2020-12-18 19:43   ` Paul Menzel
  0 siblings, 2 replies; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-18 19:27 UTC
To: intel-wired-lan

Yes, and I'm plugging the hole in the README right now. Here's the proposed text:

Advertisements for 2.5G and 5G on the x550 were turned off by default due to
interoperability issues with certain switches. To turn them back on, use

    ethtool -s <ethX> advertise N

where N is a combination of the following.

    100baseTFull    0x008
    1000baseTFull   0x020
    2500baseTFull   0x800000000000
    5000baseTFull   0x1000000000000
    10000baseTFull  0x1000

For example, to turn on all modes:

    ethtool -s <ethX> advertise 0x1800000001028

For more details please see the ethtool man page.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com
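The example mask in the proposed README text above is the bitwise OR of the listed values. A minimal sketch in Python that reproduces the arithmetic and builds the matching command line; the function names and the `eth0` interface name are hypothetical, and the command is only assembled here, not executed:

```python
# Link-mode values exactly as listed in the proposed README text.
LINK_MODES = {
    "100baseTFull": 0x008,
    "1000baseTFull": 0x020,
    "2500baseTFull": 0x800000000000,
    "5000baseTFull": 0x1000000000000,
    "10000baseTFull": 0x1000,
}


def advertise_mask(modes):
    """Return the bitwise OR of the values for the named link modes."""
    mask = 0
    for name in modes:
        mask |= LINK_MODES[name]  # raises KeyError for an unknown mode name
    return mask


def ethtool_command(iface, modes):
    """Build the argv for `ethtool -s <iface> advertise <mask>` (not run here)."""
    return ["ethtool", "-s", iface, "advertise", hex(advertise_mask(modes))]


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the real interface name.
    print(" ".join(ethtool_command("eth0", LINK_MODES)))
```

Passing only a subset of names, for example the 1G, 2.5G and 10G entries, gives a mask that re-enables 2.5G without advertising 5G.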
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-18 19:27 ` Fujinaka, Todd
@ 2020-12-18 19:41   ` Ben Greear
  2020-12-18 19:43   ` Paul Menzel
  1 sibling, 0 replies; 22+ messages in thread
From: Ben Greear @ 2020-12-18 19:41 UTC
To: intel-wired-lan

That is going to be a bit painful to implement. Any chance we can get a compile-time option to enable 2.5 and 5Gbps by default, with a user option to disable those rates in case they find problems?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-18 19:27 ` Fujinaka, Todd
  2020-12-18 19:41   ` Ben Greear
@ 2020-12-18 19:43   ` Paul Menzel
  2020-12-18 23:07     ` Ben Greear
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Menzel @ 2020-12-18 19:43 UTC
To: intel-wired-lan

Dear Todd,


On 18.12.20 at 20:27, Fujinaka, Todd wrote:
> Yes, and I'm plugging the hole in the README right now. Here's the proposed text:
>
> Advertisements for 2.5G and 5G on the x550 were turned off by default due to
> interoperability issues with certain switches. To turn them back on, use
>
> ethtool -s <ethX> advertise N
>
> where N is a combination of the following.
>
> 100baseTFull    0x008
> 1000baseTFull   0x020
> 2500baseTFull   0x800000000000
> 5000baseTFull   0x1000000000000
> 10000baseTFull  0x1000
>
> For example, to turn on all modes:
> ethtool -s <ethX> advertise 0x1800000001028
>
> For more details please see the ethtool man page.

What commit introduced this regression? Please bear in mind that this contradicts Linux's no-regression policy, and the commit should therefore be reverted as soon as possible.


Kind regards,

Paul
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-18 19:43 ` Paul Menzel
@ 2020-12-18 23:07 ` Ben Greear
  2020-12-18 23:19   ` Paul Menzel
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Greear @ 2020-12-18 23:07 UTC
To: intel-wired-lan

On 12/18/20 11:43 AM, Paul Menzel wrote:
> On 18.12.20 at 20:27, Fujinaka, Todd wrote:
>> Yes, and I'm plugging the hole in the README right now. Here's the proposed text:
>> [...]
>
> What commit introduced this regression? Please bear in mind that this contradicts Linux's no-regression policy, and the commit should therefore be reverted as
> soon as possible.

Looks like it is at the end of this patch, though the description doesn't mention changing defaults:

Commit a296d665eae1e8ec6445683bfb999c884058426a
Author: Radoslaw Tyl <radoslawx.tyl@intel.com>
Date:   Fri Jun 26 15:28:14 2020 +0200

    ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support

    Added full support for new version Ethtool API. New API allow use
    2500Gbase-T and 5000base-T supported and advertised link speed modes.

    Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-18 23:07 ` Ben Greear
@ 2020-12-18 23:19 ` Paul Menzel
  2020-12-19  0:09   ` Fujinaka, Todd
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Menzel @ 2020-12-18 23:19 UTC
To: intel-wired-lan

[+cc Radoslaw, Aleksandr, Piotr]

On 19.12.20 at 00:07, Ben Greear wrote:
> On 12/18/20 11:43 AM, Paul Menzel wrote:
>> On 18.12.20 at 20:27, Fujinaka, Todd wrote:
>>> Yes, and I'm plugging the hole in the README right now. Here's the
>>> proposed text:
>>> [...]
>>
>> What commit introduced this regression? Please bear in mind that this
>> contradicts Linux's no-regression policy, and the commit should
>> therefore be reverted as soon as possible.
>
> Looks like it is at the end of this patch, though the description doesn't
> mention changing defaults:
>
> Commit a296d665eae1e8ec6445683bfb999c884058426a
> Author: Radoslaw Tyl <radoslawx.tyl@intel.com>
> Date:   Fri Jun 26 15:28:14 2020 +0200
>
>     ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support
>
>     Added full support for new version Ethtool API. New API allow use
>     2500Gbase-T and 5000base-T supported and advertised link speed modes.
>
>     Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
>     Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
>     Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
>
> Thanks,
> Ben
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-18 23:19 ` Paul Menzel
@ 2020-12-19  0:09 ` Fujinaka, Todd
  2020-12-19  0:47   ` Ben Greear
  2020-12-19  7:54   ` Paul Menzel
  0 siblings, 2 replies; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-19 0:09 UTC
To: intel-wired-lan

What do you consider a regression? Having to enable 2.5G and 5G using ethtool, which can be done at boot time?

We had more than a few datacenters with issues because of competing standards. I checked with our marketing people and, on the whole, no one could think of a large number of 2.5G or 5G customers.

We had several escalations from major OEMs and this was the solution they wanted.

We consider this necessary for interoperability.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com

-----Original Message-----
From: Paul Menzel <pmenzel@molgen.mpg.de>
Sent: Friday, December 18, 2020 3:19 PM
To: Ben Greear <greearb@candelatech.com>; Fujinaka, Todd <todd.fujinaka@intel.com>
Cc: intel-wired-lan at lists.osuosl.org; Greg KH <gregkh@linuxfoundation.org>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Tyl, RadoslawX <radoslawx.tyl@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Mclean, Arthur F <arthur.f.mclean@intel.com>; Skajewski, PiotrX <piotrx.skajewski@intel.com>
Subject: Re: [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?

[+cc Radoslaw, Aleksandr, Piotr]

[...]
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19  0:09 ` Fujinaka, Todd
@ 2020-12-19  0:47   ` Ben Greear
  2020-12-19 16:19     ` Fujinaka, Todd
  2020-12-19  7:54   ` Paul Menzel
  1 sibling, 1 reply; 22+ messages in thread
From: Ben Greear @ 2020-12-19 0:47 UTC
To: intel-wired-lan

On 12/18/20 4:09 PM, Fujinaka, Todd wrote:
> What do you consider a regression? Having to enable 2.5G and 5G using ethtool, which can be done at boot time?
>
> We had more than a few datacenters with issues because of competing standards. I checked with our marketing people and, on the whole, no one could think of a large number of 2.5G or 5G customers.
>
> We had several escalations from major OEMs and this was the solution they wanted.
>
> We consider this necessary for interoperability.

Can you detect this case somehow and automatically fall back to 1Gbps?

For my own purposes, I will just hack that commit, but it is likely to be confusing to other people who had a system that worked at 2.5 previously and then suddenly it is slower. There is no easy way to know from the symptom that you need to dig up an obscure README and run an obscure ethtool command.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19  0:47 ` Ben Greear
@ 2020-12-19 16:19 ` Fujinaka, Todd
  2020-12-19 16:48   ` Ben Greear
  2020-12-21 15:09   ` [Intel-wired-lan] ixgbe: " Paul Menzel
  0 siblings, 2 replies; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-19 16:19 UTC
To: intel-wired-lan

This is a bad case with no ideal solution. Detecting the case is not possible as autonegotiation happens in the hardware without software involvement.

One solution was to update the switch firmware for the switch that is the link partner that gives us the most trouble. The issue appears to be in competing or half-implemented standards. 2.5G and 5G were initially non-IEEE standards that different manufacturers hacked onto 1G in different ways. We implemented it to one of the standards which should be interoperable, but the corner case of the widely-deployed switch will take the link from 10G to 1G with no automated way to fix it.

Updating switches means a lot of downtime for a lot of datacenters and the OEMs we deal with would not accept that answer.

Our solution was to disable 2.5G and 5G by default. This fixes 10G linking at 1G on that switch, but 2.5G and 5G will link at 1G by default. And, as I said, I've had very little contact with people using 2.5G and 5G and I'm the guy on all the mailing lists. I apologize for making your life harder, but it seems like it's just you so far. Paul seems to be arguing with me just for the fun of it.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19 16:19 ` Fujinaka, Todd
@ 2020-12-19 16:48   ` Ben Greear
  2020-12-21 15:20     ` Fujinaka, Todd
  2020-12-21 15:09   ` [Intel-wired-lan] ixgbe: " Paul Menzel
  1 sibling, 1 reply; 22+ messages in thread
From: Ben Greear @ 2020-12-19 16:48 UTC
To: intel-wired-lan

On 12/19/20 8:19 AM, Fujinaka, Todd wrote:
> This is a bad case with no ideal solution. Detecting the case is not possible as autonegotiation happens in the hardware without software involvement.

So, after it negotiates to 2.5, what happens? Do you see lots of low-level CRC errors or similar? Maybe you can use that to determine the link is bad and force it back to 1Gbps and re-negotiate the link? (And with a nice visible warning in dmesg about what is going on.)

> One solution was to update the switch firmware for the switch that is the link partner that gives us the most trouble. The issue appears to be in competing or half-implemented standards. 2.5G and 5G were initially non-IEEE standards that different manufacturers hacked onto 1G in different ways. We implemented it to one of the standards which should be interoperable, but the corner case of the widely-deployed switch will take the link from 10G to 1G with no automated way to fix it.
>
> Updating switches means a lot of downtime for a lot of datacenters and the OEMs we deal with would not accept that answer.
>
> Our solution was to disable 2.5G and 5G by default. This fixes 10G linking at 1G on that switch, but 2.5G and 5G will link at 1G by default. And, as I said, I've had very little contact with people using 2.5G and 5G and I'm the guy on all the mailing lists. I apologize for making your life harder, but it seems like it's just you so far. Paul seems to be arguing with me just for the fun of it.

Well, when things work, no one talks about it. :)

Are you able to determine that the peer is advertising 2.5 and the local NIC is forced to 1G, and then put a visible warning in dmesg about this case with a link to how to enable 2.5/5G rates? That might help people realize what is going on.

And when you do this commit, put a lot of notes about why and about what commit changed things, since it is not at all obvious from the original commit message.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
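The check Ben asks about (peer advertises 2.5G or 5G, local port does not) can at least be approximated from userspace. A rough sketch in Python, assuming the adapter and driver actually report a "Link partner advertised link modes" section in `ethtool <iface>` output, which this thread does not confirm for the X550; the sample text, function names, and `eth0` are hypothetical, and this is not the in-driver dmesg warning being requested:

```python
import re

# Section headers as printed by `ethtool <iface>`.
LOCAL = "Advertised link modes"
PEER = "Link partner advertised link modes"


def parse_modes(ethtool_output, header):
    """Collect the link-mode tokens listed under one section header."""
    modes = set()
    collecting = False
    for line in ethtool_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(header + ":"):
            collecting = True
            stripped = stripped[len(header) + 1:]
        elif ":" in stripped:
            collecting = False  # any other "Key: value" line ends the section
        if collecting:
            modes.update(re.findall(r"\d+base\S+", stripped))
    return modes


def hidden_speeds(ethtool_output, wanted=("2500baseT/Full", "5000baseT/Full")):
    """Return modes the peer advertises but the local port does not."""
    local = parse_modes(ethtool_output, LOCAL)
    peer = parse_modes(ethtool_output, PEER)
    return sorted(m for m in wanted if m in peer and m not in local)


# Hypothetical sample output, trimmed to the relevant sections.
SAMPLE = """\
Settings for eth0:
    Advertised link modes:  100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: Symmetric
    Link partner advertised link modes:  1000baseT/Full
                                         2500baseT/Full
    Link partner advertised pause frame use: No
    Speed: 1000Mb/s
"""

if __name__ == "__main__":
    for mode in hidden_speeds(SAMPLE):
        print(f"peer advertises {mode} but it is not advertised locally")
```

On a real system the text would come from running `ethtool` on the interface; a non-empty result is the situation where a user would want to be pointed at the `advertise` setting.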
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19 16:48 ` Ben Greear
@ 2020-12-21 15:20 ` Fujinaka, Todd
  2020-12-21 15:52   ` Ben Greear
  0 siblings, 1 reply; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-21 15:20 UTC
To: intel-wired-lan

Nope. The timing of the PHYs means the switch times out while we're trying 2.5G and 5G and the switch goes to its default lowest speed of 1G. Then we go to 1G and by that time bonding is broken in several of the cases we ran into.

Basically, we can have that switch work, or we can have 2.5G and 5G on by default. Not both. And since we're selling a 10G device with other speeds as a bonus, we're prioritizing the highest speed. That plus the very high profile customers who wanted this solution.

The solution for one camp or the other is to use the ethtool command at boot (I've forgotten exactly what that was) but the high profile customers refused to do that. Sounds like you're refusing as well?

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com

-----Original Message-----
From: Ben Greear <greearb@candelatech.com>
Sent: Saturday, December 19, 2020 8:48 AM
To: Fujinaka, Todd <todd.fujinaka@intel.com>; Paul Menzel <pmenzel@molgen.mpg.de>
Cc: intel-wired-lan at lists.osuosl.org; Greg KH <gregkh@linuxfoundation.org>
Subject: Re: [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?

On 12/19/20 8:19 AM, Fujinaka, Todd wrote:
> This is a bad case with no ideal solution. Detecting the case is not possible as autonegotiation happens in the hardware without software involvement.
>

So, after it negotiates to 2.5, what happens? Do you see lots of low-level CRC errors or similar? Maybe you can use that to determine the link is bad and force it back to 1Gbps and re-negotiate the link? (And with a nice visible warning in dmesg about what is going on.)

> One solution was to update the switch firmware for the switch that is the link partner that gives us the most trouble. The issue appears to be in competing or half-implemented standards. 2.5G and 5G were initially non-IEEE standards that different manufacturers hacked onto 1G in different ways. We implemented it to one of the standards which should be interoperable, but the corner case of the widely-deployed switch will take the link from 10G to 1G with no automated way to fix it.
>
> Updating switches means a lot of downtime for a lot of datacenters and the OEMs we deal with would not accept that answer.
>
> Our solution was to disable 2.5G and 5G by default. This fixes 10G linking at 1G on that switch, but 2.5G and 5G will link at 1G by default. And, as I said, I've had very little contact with people using 2.5G and 5G and I'm the guy on all the mailing lists. I apologize for making your life harder, but it seems like it's just you so far. Paul seems to be arguing with me just for the fun of it.

Well, when things work, no one talks about it. :)

Are you able to determine that the peer is advertising 2.5 and the local NIC is forced to 1G, and then put a visible warning in dmesg about this case with a link to how to enable 2.5/5G rates? That might help people realize what is going on.

And when you do this commit, put a lot of notes about why and about what commit changed things, since it is not at all obvious from the original commit message.

Thanks,
Ben

>
> Todd Fujinaka
> Software Application Engineer
> Data Center Group
> Intel Corporation
> todd.fujinaka at intel.com
>
> -----Original Message-----
> From: Ben Greear <greearb@candelatech.com>
> Sent: Friday, December 18, 2020 4:47 PM
> To: Fujinaka, Todd <todd.fujinaka@intel.com>; Paul Menzel
> <pmenzel@molgen.mpg.de>
> Cc: intel-wired-lan at lists.osuosl.org; Greg KH
> <gregkh@linuxfoundation.org>
> Subject: Re: [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
>
> On 12/18/20 4:09 PM, Fujinaka, Todd wrote:
>> What do you consider a regression? Having to enable 2.5G and 5G using ethtool, which can be done at boot time?
>>
>> We had more than a few datacenters with issues because of competing standards. I checked with our marketing people and, on the whole, no one could think of a large number of 2.5G or 5G customers.
>>
>> We had several escalations from major OEMs and this was the solution they wanted.
>>
>> We consider this necessary for interoperability.
>
> Can you detect this case somehow and automatically fall back to 1Gbps?
>
> For my own purposes, I will just hack that commit, but it is likely to be confusing to other people who had a system that worked at 2.5 previously and then suddenly it is slower. There is no easy way to know from the symptom that you need to dig up an obscure README and run an obscure ethtool command.
>
> Thanks,
> Ben
>
>>
>> Todd Fujinaka
>> Software Application Engineer
>> Data Center Group
>> Intel Corporation
>> todd.fujinaka at intel.com
>>
>> -----Original Message-----
>> From: Paul Menzel <pmenzel@molgen.mpg.de>
>> Sent: Friday, December 18, 2020 3:19 PM
>> To: Ben Greear <greearb@candelatech.com>; Fujinaka, Todd
>> <todd.fujinaka@intel.com>
>> Cc: intel-wired-lan at lists.osuosl.org; Greg KH
>> <gregkh@linuxfoundation.org>; Nguyen, Anthony L
>> <anthony.l.nguyen@intel.com>; Brandeburg, Jesse
>> <jesse.brandeburg@intel.com>; Tyl, RadoslawX
>> <radoslawx.tyl@intel.com>; Loktionov, Aleksandr
>> <aleksandr.loktionov@intel.com>; Mclean, Arthur F
>> <arthur.f.mclean@intel.com>; Skajewski, PiotrX
>> <piotrx.skajewski@intel.com>
>> Subject: Re: [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
>>
>> [+cc Radoslaw, Aleksandr, Piotr]
>>
>> On 19.12.20 at 00:07, Ben Greear wrote:
>>> On 12/18/20 11:43 AM, Paul Menzel wrote:
>>
>>>> On 18.12.20 at 20:27, Fujinaka, Todd wrote:
>>>>> Yes, and I'm plugging the hole in the README right now. Here's the
>>>>> proposed text:
>>>>>
>>>>> Advertisements for 2.5G and 5G on the x550 were turned off by
>>>>> default due to interoperability issues with certain switches. To
>>>>> turn them back on, use
>>>>>
>>>>> ethtool -s <ethX> advertise N
>>>>>
>>>>> where N is a combination of the following.
>>>>>
>>>>> 100baseTFull    0x008
>>>>> 1000baseTFull   0x020
>>>>> 2500baseTFull   0x800000000000
>>>>> 5000baseTFull   0x1000000000000
>>>>> 10000baseTFull  0x1000
>>>>>
>>>>> For example, to turn on all modes:
>>>>> ethtool -s <ethX> advertise 0x1800000001028
>>>>>
>>>>> For more details please see the ethtool man page.
>>>>
>>>> What commit introduced this regression? Please bear in mind that
>>>> this contradicts Linux's no-regression policy, and the commit should
>>>> therefore be reverted as soon as possible.
>>> >>> Looks like it is at the end of this patch, though the description >>> doesn't mention changing defaults: >>> >>> Commit a296d665eae1e8ec6445683bfb999c884058426a >>> Author: Radoslaw Tyl <radoslawx.tyl@intel.com> >>> Date:?? Fri Jun 26 15:28:14 2020 +0200 >>> >>> ??? ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support >>> >>> ??? Added full support for new version Ethtool API. New API allow use >>> ??? 2500Gbase-T and 5000base-T supported and advertised link speed modes. >>> >>> ??? Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com> >>> ??? Tested-by: Andrew Bowers <andrewx.bowers@intel.com> >>> ??? Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> >>> >>> Thanks, >>> Ben > > > -- > Ben Greear <greearb@candelatech.com> > Candela Technologies Inc http://www.candelatech.com > -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 22+ messages in thread
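Since the ethtool command "at boot" comes up without the exact value, here is a minimal sketch of how the advertise mask from the proposed README text can be derived. The bit values are copied from that README text; the helper names (`advertise_mask`, `ethtool_command`) and the interface name `eth0` are illustrative assumptions, not part of any Intel tooling.

```python
# Bit values copied from the proposed README text quoted in this thread.
LINK_MODE_BITS = {
    "100baseTFull": 0x008,
    "1000baseTFull": 0x020,
    "2500baseTFull": 0x800000000000,
    "5000baseTFull": 0x1000000000000,
    "10000baseTFull": 0x1000,
}


def advertise_mask(modes):
    """OR together the bit values of the requested link modes."""
    mask = 0
    for mode in modes:
        mask |= LINK_MODE_BITS[mode]  # raises KeyError on an unknown mode name
    return mask


def ethtool_command(interface, modes):
    """Build the argv for `ethtool -s <interface> advertise <mask>`."""
    return ["ethtool", "-s", interface, "advertise", hex(advertise_mask(modes))]


if __name__ == "__main__":
    # "eth0" is a placeholder interface name (hypothetical).
    print(" ".join(ethtool_command("eth0", LINK_MODE_BITS)))
```

Passing every mode reproduces the README's "all modes" value, 0x1800000001028; leaving out the 2500baseTFull and 5000baseTFull entries gives a mask without the 2.5G and 5G bits, which is what the thread describes as the new default.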
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-21 15:20 ` Fujinaka, Todd
@ 2020-12-21 15:52 ` Ben Greear
  2020-12-21 15:58 ` Fujinaka, Todd
  2020-12-22  8:59 ` Greg KH
  0 siblings, 2 replies; 22+ messages in thread
From: Ben Greear @ 2020-12-21 15:52 UTC
To: intel-wired-lan

On 12/21/20 7:20 AM, Fujinaka, Todd wrote:
> Nope. The timing of the PHYs means the switch times out while we're trying 2.5G and 5G and the switch goes to its default lowest speed of 1G. Then we go to 1G and by that time bonding is broken in several of the cases we ran into.
>
> Basically, we can have that switch work, or we can have 2.5G and 5G on by default. Not both. And since we're selling a 10G device with other speeds as a bonus, we're prioritizing the highest speed. That plus the very high profile customers who wanted this solution.
>
> The solution for one camp or the other is to use the ethtool command at boot (I've forgotten exactly what that was) but the high profile customers refused to do that. Sounds like you're refusing as well?

I'm not refusing, I just would rather patch my kernels than use ethtool; that way my older user-space would work fine on newer kernels.

Would you accept a patch that makes this a module option, defaulted to disable 2.5/5, but which a user could set to enable 2.5/5 by default?

I'd find that easier to use than the ethtool modification, and of course ethtool could still override things as desired.

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-21 15:52 ` Ben Greear
@ 2020-12-21 15:58 ` Fujinaka, Todd
  2020-12-21 16:04 ` Ben Greear
  2020-12-22  8:59 ` Greg KH
  1 sibling, 1 reply; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-21 15:58 UTC
To: intel-wired-lan

For the standalone driver? I can certainly ask for the change. It might take a while (knowing what's going on here) but I can champion that.

As for in-kernel, I think Intel wants to keep it this way. Not saying Intel won't be outvoted, but this is what has been demanded by the customer so far.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-21 15:58 ` Fujinaka, Todd
@ 2020-12-21 16:04 ` Ben Greear
  0 siblings, 0 replies; 22+ messages in thread
From: Ben Greear @ 2020-12-21 16:04 UTC
To: intel-wired-lan

On 12/21/20 7:58 AM, Fujinaka, Todd wrote:
> For the standalone driver? I can certainly ask for the change. It might take a while (knowing what's going on here) but I can champion that.
>
> As for in-kernel, I think Intel wants to keep it this way. Not saying Intel won't be outvoted, but this is what has been demanded by the customer so far.

The out-of-kernel driver doesn't help me personally. I'll let you all figure it out, and will just patch my own kernels accordingly.

Thanks for the quick response to my original query.

--Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-21 15:52 ` Ben Greear
  2020-12-21 15:58 ` Fujinaka, Todd
@ 2020-12-22  8:59 ` Greg KH
  1 sibling, 0 replies; 22+ messages in thread
From: Greg KH @ 2020-12-22 8:59 UTC
To: intel-wired-lan

On Mon, Dec 21, 2020 at 07:52:54AM -0800, Ben Greear wrote:
> On 12/21/20 7:20 AM, Fujinaka, Todd wrote:
> > Nope. The timing of the PHYs means the switch times out while we're trying 2.5G and 5G and the switch goes to its default lowest speed of 1G. Then we go to 1G and by that time bonding is broken in several of the cases we ran into.
> >
> > Basically, we can have that switch work, or we can have 2.5G and 5G on by default. Not both. And since we're selling a 10G device with other speeds as a bonus, we're prioritizing the highest speed. That plus the very high profile customers who wanted this solution.
> >
> > The solution for one camp or the other is to use the ethtool command at boot (I've forgotten exactly what that was) but the high profile customers refused to do that. Sounds like you're refusing as well?
>
> I'm not refusing, I just would rather patch my kernels than use ethtool; that way my older user-space would work fine on newer kernels.
>
> Would you accept a patch that makes this a module option, defaulted to disable 2.5/5, but which a user could set to enable 2.5/5 by default?

Module options are not OK; this is not the 1990s. Please use the proper configuration methods instead, which can work on a per-device basis.

thanks,

greg k-h
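The message above does not name a specific per-device mechanism. As one illustration only, the `ethtool -s <ethX> advertise N` command discussed elsewhere in the thread already works per interface, so a boot-time script could apply a different mask to each port. The sketch below assumes a hypothetical policy table (`PER_DEVICE_ADVERTISE`) with placeholder interface names; it defaults to a dry run that only prints the commands.

```python
import subprocess

# Hypothetical per-interface policy: interface name -> advertise mask.
# The masks are built from the bit values in the proposed README text.
PER_DEVICE_ADVERTISE = {
    "eth0": 0x1800000001028,  # all modes, including 2.5G and 5G
    "eth1": 0x1028,           # 100M, 1G and 10G only
}


def build_commands(policy):
    """Return one `ethtool -s <dev> advertise <mask>` argv per interface."""
    return [
        ["ethtool", "-s", dev, "advertise", f"{mask:#x}"]
        for dev, mask in sorted(policy.items())
    ]


def apply_policy(policy, dry_run=True):
    """Print the commands, or run them when dry_run is False (needs root)."""
    commands = build_commands(policy)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    apply_policy(PER_DEVICE_ADVERTISE)
```

Keeping the policy as data means one port can keep the driver's default speeds while another opts back in to 2.5G and 5G, which a single module-wide option could not express.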
* [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19 16:19 ` Fujinaka, Todd
  2020-12-19 16:48 ` Ben Greear
@ 2020-12-21 15:09 ` Paul Menzel
  2020-12-21 15:16 ` Fujinaka, Todd
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Menzel @ 2020-12-21 15:09 UTC
To: intel-wired-lan

Dear Todd,


I kindly ask you again, please do not top-post. It's impolite and, more
importantly, it wastes the readers' time, as it loses context and
results in misunderstandings.


On 19.12.20 at 17:19, Fujinaka, Todd wrote:
> This is a bad case with no ideal solution. Detecting the case is not
> possible as autonegotiation happens in the hardware without software
> involvement.
>
> One solution was to update the switch firmware for a switch that
> is the link partner that gives us the most trouble. The issue
> appears to be in competing or half-implemented standards. 2.5G and 5G
> were initially non-IEEE standards that different manufacturers hacked
> onto 1G in different ways. We implemented it to one of the standards
> which should be interoperable, but the corner case of the
> widely-deployed switch will take the link from 10G to 1G with no
> automated way to fix it.

Thank you for the background, which should have been in the commit
message.

Can you please tell us the problematic switch name, the problematic
firmware version, and the version in which this issue is fixed?

> Updating switches means a lot of downtime for a lot of datacenters
> and the OEMs we deal with would not accept that answer.

Well, then please discuss the problem and possible solutions on the
mailing list. Breaking other people's setups is unacceptable. A Linux
kernel runtime parameter would be one solution your customers could
have used.

> Our solution was to disable 2.5G and 5G by default. This fixes 10G
> linking at 1G on that switch, but 2.5G and 5G will link at 1G by
> default. And, as I said, I've had very little contact with people
> using 2.5G and 5G and I'm the guy on all the mailing lists.

Unfortunately, a lot of users are not on the mailing list.

> I apologize for making your life harder, but it seems like it's just
> you so far. Paul seems to be arguing with me just for the fun of it.

Please keep the discussion respectful, and do not insult others.
Unfortunately, at work we have now been bitten several times by
regressions when updating to the current mainline Linux kernel, causing
friction in the team about what Linux kernel to use.

I am missing a statement by you, acknowledging that the commit and the
whole communication was a big fail, and how you will fix the regression.
Additionally, an analysis would be nice, where the process failed - why
was the commit message incomplete and why did the test (Tested-by
present) not spot the issue - and how to improve it to avoid such a
situation in the future.


Kind regards,

Paul
* [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-21 15:09 ` [Intel-wired-lan] ixgbe: " Paul Menzel
@ 2020-12-21 15:16 ` Fujinaka, Todd
  2020-12-21 15:31 ` Hisashi T Fujinaka
  0 siblings, 1 reply; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-21 15:16 UTC
To: intel-wired-lan

I would listen to you on Linus' list, but this is Intel-wired-lan.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com

-----Original Message-----
From: Paul Menzel <pmenzel@molgen.mpg.de>
Sent: Monday, December 21, 2020 7:10 AM
To: Fujinaka, Todd <todd.fujinaka@intel.com>; Ben Greear <greearb@candelatech.com>
Cc: intel-wired-lan at lists.osuosl.org; Greg KH <gregkh@linuxfoundation.org>; Linus Torvalds <torvalds@linux-foundation.org>; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>
Subject: Re: [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?

[...]
* [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-21 15:16 ` Fujinaka, Todd
@ 2020-12-21 15:31 ` Hisashi T Fujinaka
  0 siblings, 0 replies; 22+ messages in thread
From: Hisashi T Fujinaka @ 2020-12-21 15:31 UTC
To: intel-wired-lan

I'm going to answer this from home, where Outlook isn't impeding me.
This is the only time I'm doing this because I can't find your email
any more. Outlook has cleverly hidden it from me.

On Mon, 21 Dec 2020, Fujinaka, Todd wrote:

> I would listen to you on Linus' list, but this is Intel-wired-lan.
>
> Todd Fujinaka
> Software Application Engineer
> Data Center Group
> Intel Corporation
> todd.fujinaka at intel.com
>
> -----Original Message-----
> From: Paul Menzel <pmenzel@molgen.mpg.de>
> Sent: Monday, December 21, 2020 7:10 AM
> To: Fujinaka, Todd <todd.fujinaka@intel.com>; Ben Greear <greearb@candelatech.com>
> Cc: intel-wired-lan at lists.osuosl.org; Greg KH <gregkh@linuxfoundation.org>; Linus Torvalds <torvalds@linux-foundation.org>; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>
> Subject: Re: [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
>
> Dear Todd,
>
>
> I kindly ask you again, please do not top-post. It's impolite and,
> more importantly, it wastes the readers' time, as it loses context
> and results in misunderstandings.

This is where I should've inserted my comment about Outlook and
intel-wired-lan vs Linus' lists. It's a pain, but

> On 19.12.20 at 17:19, Fujinaka, Todd wrote:
>> This is a bad case with no ideal solution. Detecting the case is not
>> possible as autonegotiation happens in the hardware without software
>> involvement.
>>
>> One solution was to update the switch firmware for a switch that
>> is the link partner that gives us the most trouble. The issue
>> appears to be in competing or half-implemented standards. 2.5G and 5G
>> were initially non-IEEE standards that different manufacturers hacked
>> onto 1G in different ways. We implemented it to one of the standards
>> which should be interoperable, but the corner case of the
>> widely-deployed switch will take the link from 10G to 1G with no
>> automated way to fix it.
>
> Thank you for the background, which should have been in the commit
> message.
>
> Can you please tell us the problematic switch name, the problematic
> firmware version, and the version in which this issue is fixed?

I can ask around. I wasn't on those issues. The problem isn't with the
switch manufacturer because they've released a fix, but with the
datacenters who don't want to update their switches. I've been loath to
reveal more data because that's confidential to the customer.

>> Updating switches means a lot of downtime for a lot of datacenters and
>> the OEMs we deal with would not accept that answer.
>
> Well, then please discuss the problem and possible solutions on the
> mailing list. Breaking other people's setups is unacceptable. A Linux
> kernel runtime parameter would be one solution your customers could
> have used.

Runtime parameter? That's even higher on the list of "not allowed". I've
said several times that the end-customers wouldn't update their switches
and wouldn't use any boot parameters. Customers high enough that the
executive VP of several companies called our company and demanded an
immediate fix.

>> Our solution was to disable 2.5G and 5G by default. This fixes 10G
>> linking at 1G on that switch, but 2.5G and 5G will link at 1G by
>> default. And, as I said, I've had very little contact with people
>> using 2.5G and 5G and I'm the guy on all the mailing lists.
>
> Unfortunately, a lot of users are not on the mailing list.

On ANY mailing list. This isn't the only one I'm on.

>> I apologize for making your life harder, but it seems like it's just
>> you so far. Paul seems to be arguing with me just for the fun of it.
>
> Please keep the discussion respectful, and do not insult others.

I'm not being disrespectful, I'm just saying you're just arguing
semantics and "rules".

> Unfortunately, at work we have now been bitten several times by
> regressions when updating to the current mainline Linux kernel,
> causing friction in the team about what Linux kernel to use.
>
> I am missing a statement by you, acknowledging that the commit and the
> whole communication was a big fail, and how you will fix the regression.
> Additionally, an analysis would be nice, where the process failed - why
> was the commit message incomplete and why did the test (Tested-by
> present) not spot the issue - and how to improve it to avoid such a
> situation in the future.

Communications was a big fail, and I'm here to try to solve that. We
will not be reverting this, in fact I've been told by my management that
this is required. And my management goes way up the chain to executive
VPs at Intel.

Right now I'm between a rock and a hard place and 2.5G and 5G is not our
primary market. I'm not the marketing guy, so I didn't make that
decision.
* [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19 0:09 ` Fujinaka, Todd
  2020-12-19 0:47 ` Ben Greear
@ 2020-12-19 7:54 ` Paul Menzel
  2020-12-19 16:07 ` Fujinaka, Todd
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Menzel @ 2020-12-19 7:54 UTC
To: intel-wired-lan

Dear Todd,


Thank you for your reply. What is the reason you stripped the
maintainers from the Cc list again? Also, please adhere to mailing list
etiquette, and do not top-post, but use interleaved style.

For context: Commit a296d665ea (ixgbe: Add ethtool support to enable 2.5
and 5.0 Gbps support), present since 5.9-rc1, introduced a regression:
link negotiation now defaults to 1 Gbps, and ethtool has to be run to
enable the higher 2.5 Gbps speed.


On 19.12.20 at 01:09, Fujinaka, Todd wrote:
> What do you consider a regression? Having to enable 2.5G and 5G using
> ethtool which can be done at boot time?

Well, Linux' no-regression policy should be well known by Linux kernel
developers and maintainers. People can always update to the mainline
Linux kernel, and expect their setup to work as with the old Linux
kernel. Even if the behavior before was a bug. But maybe I am wrong, so
Linus is in the Cc list now.

> We had more than a few datacenters with issues because of competing
> standards. I checked with our marketing people and, on the whole, no
> one could think of a large number of 2.5G or 5G customers.
>
> We had several escalations from major OEMs and this was the solution
> they wanted.
>
> We consider this necessary for interoperability.

As written, this does not matter, as far as I know. You have to find a
way to not regress working setups. It also shows that your process
should be more open.

In this case, I am particularly upset that the commit changed the
defaults without any mention in the commit message, and that the commit
message misses all the information and context, which now took a while
to gather from you.

Additionally, in my opinion, a warning or notice should be printed by
Linux about this issue.


Kind regards,

Paul


> -----Original Message-----
> From: Paul Menzel <pmenzel@molgen.mpg.de>
> Sent: Friday, December 18, 2020 3:19 PM
> To: Ben Greear <greearb@candelatech.com>; Fujinaka, Todd <todd.fujinaka@intel.com>
> Cc: intel-wired-lan at lists.osuosl.org; Greg KH <gregkh@linuxfoundation.org>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Tyl, RadoslawX <radoslawx.tyl@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Mclean, Arthur F <arthur.f.mclean@intel.com>; Skajewski, PiotrX <piotrx.skajewski@intel.com>
> Subject: Re: [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation?
>
> [+cc Radoslaw, Aleksandr, Piotr]
>
> On 19.12.20 at 00:07, Ben Greear wrote:
>> On 12/18/20 11:43 AM, Paul Menzel wrote:
>
>>> On 18.12.20 at 20:27, Fujinaka, Todd wrote:
>>>> Yes, and I'm plugging the hole in the README right now. Here's the
>>>> proposed text:
>>>>
>>>> Advertisements for 2.5G and 5G on the x550 were turned off by
>>>> default due to interoperability issues with certain switches. To
>>>> turn them back on, use
>>>>
>>>> ethtool -s <ethX> advertise N
>>>>
>>>> where N is a combination of the following.
>>>>
>>>> 100baseTFull    0x008
>>>> 1000baseTFull   0x020
>>>> 2500baseTFull   0x800000000000
>>>> 5000baseTFull   0x1000000000000
>>>> 10000baseTFull  0x1000
>>>>
>>>> For example, to turn on all modes:
>>>> ethtool -s <ethX> advertise 0x1800000001028
>>>>
>>>> For more details please see the ethtool man page.
>>>
>>> What commit introduced this regression? Please bear in mind that
>>> this contradicts Linux' no-regression policy, and the commit should
>>> therefore be reverted as soon as possible.
>>
>> Looks like it is at the end of this patch, though the description
>> doesn't mention changing defaults:
>>
>> Commit a296d665eae1e8ec6445683bfb999c884058426a
>> Author: Radoslaw Tyl <radoslawx.tyl@intel.com>
>> Date:   Fri Jun 26 15:28:14 2020 +0200
>>
>>     ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support
>>
>>     Added full support for new version Ethtool API. New API allow use
>>     2500Gbase-T and 5000base-T supported and advertised link speed modes.
>>
>>     Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
>>     Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
>>     Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
>>
>> Thanks,
>> Ben
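The "combination" in the quoted README text is a bitwise OR of the listed values. As a rough illustration, the sketch below rebuilds the "all modes" example from the table and decodes a mask back into mode names. The bit values are copied from the quoted table; the helper names `advertise_mask` and `decode_mask` are made up for this example, and the 0x1028 value is simply the same table with the 2.5G and 5G bits left out.

```python
# Bit values copied from the README table quoted in this thread.
LINK_MODES = {
    "100baseTFull": 0x008,
    "1000baseTFull": 0x020,
    "10000baseTFull": 0x1000,
    "2500baseTFull": 0x800000000000,
    "5000baseTFull": 0x1000000000000,
}


def advertise_mask(*modes):
    """OR together the bits for the named link modes."""
    mask = 0
    for mode in modes:
        mask |= LINK_MODES[mode]
    return mask


def decode_mask(mask):
    """List the known link modes whose bit is set in the mask."""
    return [name for name, bit in LINK_MODES.items() if mask & bit]


# All five modes give the value used in the README example.
print(hex(advertise_mask(*LINK_MODES)))  # 0x1800000001028

# The same table without the 2.5G and 5G bits.
without_nbase_t = advertise_mask("100baseTFull", "1000baseTFull", "10000baseTFull")
print(hex(without_nbase_t), decode_mask(without_nbase_t))
```

The resulting value is what would be passed as N to `ethtool -s <ethX> advertise N`.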
* [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19 7:54 ` Paul Menzel
@ 2020-12-19 16:07 ` Fujinaka, Todd
  2020-12-19 20:59 ` Linus Torvalds
  0 siblings, 1 reply; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-19 16:07 UTC
To: intel-wired-lan

Well, actually, we considered defeaturing 2.5G and 5G in hardware. Would
that have been better for you?

I am stripping the maintainers because they're on the mailing list
already and multiple copies of the same email means they get annoyed and
likely ignore the email.

And if you do put it back, we will likely have to disable the feature in
hardware and have variants that allow it for the people who need it. You
really want that outcome?

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com

-----Original Message-----
From: Paul Menzel <pmenzel@molgen.mpg.de>
Sent: Friday, December 18, 2020 11:55 PM
To: Fujinaka, Todd <todd.fujinaka@intel.com>; Ben Greear <greearb@candelatech.com>
Cc: Greg KH <gregkh@linuxfoundation.org>; intel-wired-lan at lists.osuosl.org; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>; Linus Torvalds <torvalds@linux-foundation.org>; Tyl, RadoslawX <radoslawx.tyl@intel.com>
Subject: Re: [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?

[...]
* [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19 16:07 ` Fujinaka, Todd
@ 2020-12-19 20:59 ` Linus Torvalds
  2020-12-21 15:12 ` Fujinaka, Todd
  0 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2020-12-19 20:59 UTC
To: intel-wired-lan

On Sat, Dec 19, 2020 at 8:07 AM Fujinaka, Todd <todd.fujinaka@intel.com> wrote:
>
> I am stripping the maintainers because they're on the mailing list
> already and multiple copies of the same email means they get annoyed
> and likely ignore the email.

Don't do that.

This is what "Message-ID" is for. People don't get multiple copies of
the email, because any sane MUA will see that they are the same message
even when they came in through two different paths.

Of course, if some maintainer has a broken MUA, that's one thing, but I
would expect kernel maintainers that deal with a lot of email to not
have quite _that_ broken a setup.

          Linus
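The de-duplication Linus describes can be sketched with Python's standard `email` package. This is a minimal illustration of the idea, not how any particular MUA implements it. The function name `dedupe_by_message_id` and the sample Message-ID are hypothetical. Two copies of a message that arrive by different paths (directly via Cc and via the list) differ in their delivery headers but carry the same Message-ID, so only the first copy is kept.

```python
from email import message_from_string


def dedupe_by_message_id(raw_messages):
    """Keep the first copy of each message, keyed on its Message-ID header."""
    seen = set()
    unique = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg.get("Message-ID") or "").strip()
        if msg_id and msg_id in seen:
            continue  # same message, delivered through another path
        if msg_id:
            seen.add(msg_id)
        unique.append(msg)  # messages without a Message-ID are always kept
    return unique


# Hypothetical example: one message received directly and once via the list.
direct_copy = (
    "Message-ID: <example-1@example.invalid>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
list_copy = (
    "List-Id: <intel-wired-lan.osuosl.org>\n"
    "Message-ID: <example-1@example.invalid>\n"
    "Subject: [Intel-wired-lan] test\n"
    "\n"
    "body\n"
)

kept = dedupe_by_message_id([direct_copy, list_copy])
print(len(kept))  # 1
```

Keying on Message-ID rather than on subject or body is what makes this robust: the list copy in the example has a rewritten subject and an extra header, yet is still recognized as the same message.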
* [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?
  2020-12-19 20:59 ` Linus Torvalds
@ 2020-12-21 15:12 ` Fujinaka, Todd
  0 siblings, 0 replies; 22+ messages in thread
From: Fujinaka, Todd @ 2020-12-21 15:12 UTC
To: intel-wired-lan

I apologize, it's Outlook 365 here at Intel though, which means
threading is broken and I'm going to get OOO messages from everyone but
me because it's the holidays.

They're all on the mailing list and they don't fix bugs unless they get
pestered anyway, but I'll follow your rules.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com

-----Original Message-----
From: Linus Torvalds <torvalds@linux-foundation.org>
Sent: Saturday, December 19, 2020 12:59 PM
To: Fujinaka, Todd <todd.fujinaka@intel.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>; Ben Greear <greearb@candelatech.com>; Greg KH <gregkh@linuxfoundation.org>; intel-wired-lan at lists.osuosl.org; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>; Tyl, RadoslawX <radoslawx.tyl@intel.com>
Subject: Re: [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?

[...]
end of thread, other threads:[~2020-12-22 8:59 UTC]

Thread overview: 22+ messages
2020-12-18 18:08 [Intel-wired-lan] 5.10.0 kernel regression for 2.5Gbps link negotiation? Ben Greear
2020-12-18 19:27 ` Fujinaka, Todd
2020-12-18 19:41   ` Ben Greear
2020-12-18 19:43   ` Paul Menzel
2020-12-18 23:07     ` Ben Greear
2020-12-18 23:19       ` Paul Menzel
2020-12-19  0:09         ` Fujinaka, Todd
2020-12-19  0:47           ` Ben Greear
2020-12-19 16:19             ` Fujinaka, Todd
2020-12-19 16:48               ` Ben Greear
2020-12-21 15:20                 ` Fujinaka, Todd
2020-12-21 15:52                   ` Ben Greear
2020-12-21 15:58                     ` Fujinaka, Todd
2020-12-21 16:04                       ` Ben Greear
2020-12-22  8:59                     ` Greg KH
2020-12-21 15:09               ` [Intel-wired-lan] ixgbe: " Paul Menzel
2020-12-21 15:16                 ` Fujinaka, Todd
2020-12-21 15:31                   ` Hisashi T Fujinaka
2020-12-19  7:54           ` Paul Menzel
2020-12-19 16:07             ` Fujinaka, Todd
2020-12-19 20:59               ` Linus Torvalds
2020-12-21 15:12                 ` Fujinaka, Todd