All of lore.kernel.org
 help / color / mirror / Atom feed
* Help on PHY not supporting autoneg
@ 2022-11-30 12:16 Piergiorgio Beruto
  2022-11-30 15:32 ` Andrew Lunn
  0 siblings, 1 reply; 5+ messages in thread
From: Piergiorgio Beruto @ 2022-11-30 12:16 UTC (permalink / raw)
  To: netdev

Hello netdev group,
I am looking for someone willing to help me on a problem I encountered
while trying to implement a driver for a PHY that does not support
autoneg. On my reference HW, the PHY is connected to a stmmac via MII.

This is what I did in the driver:
- implement get_features to report NO AN support, and the supported link
  modes.
- implement the IRQ handling (config_intr and handle_interrupt) to
  report the link status

The first problem I encountered is: how to start the PHY? The device
requires a bit (LINK_CONTROL) to be enabled before trying to bring up
the link. But I could not find any obvious callback from phylib to
actually do this. Eventually, I opted for implementing the
suspend/resume callbacks and set that bit in there. Is that right? Any
better option?

With that said, the thing still does not work as I would expect. When I
ifconfig up my device, here's what happens (the ncn_* prints are just
debug prinktks I've added to show the problem). See also my comments in
the log marked with //

/root # ifconfig eth0 up
[   26.950557] socfpga-dwmac ff700000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
[   26.962673] ncn_soft_reset
[   26.966345] ncn_config_init
[   26.969168] ncn_config_intr
[   26.972211] disable IRQs   // OK, this is expected, phylib is resetting the device
[   26.975062] ncn_resume     // not sure I expected this to be here, but it does not harm
[   26.977746] socfpga-dwmac ff700000.ethernet eth0: PHY [stmmac-0:08] driver [NCN26000] (irq=49)
[   26.986861] ncn_config_intr
[   26.990045] ncn_config_intr ret = 8000, irqs = 0001
[   26.994958] ncn_handle_interrupt 8000
[   26.998941] ncn_handle_interrupt 8001
[   27.002752] link = 1, ret = 0829       // there I get a link UP!
[   27.016526] dwmac1000: Master AXI performs any burst length
[   27.022128] socfpga-dwmac ff700000.ethernet eth0: No Safety Features support found
[   27.029999] socfpga-dwmac ff700000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[   27.039425] socfpga-dwmac ff700000.ethernet eth0: registered PTP clock
[   27.049388] socfpga-dwmac ff700000.ethernet eth0: configuring for phy/mii link mode
[   27.057155] ncn_resume  // I don't fully understand what happened since the link up, but it seems the MAC is doing more initialization
[   27.061251] ncn_handle_interrupt 8001
[   27.065100] link = 0, ret = 0809 // here I get a link down (???)

From there on, if I read the LINK_CONTROL bit, someone set it to 0 (???)
This bit lies in the control register (Clause 22, address 0).

/root # mdio-tool -i eth0 -r -a 0
clause 22 register @eth0 0x0000 --> 0x0000   // ????? That explains the link down, though

If I manually set the LINK_CONTROL bit, then I magically get the link up

/root # mdio-tool -i eth0 -w -a 0 0x1000
[   81.276504] ncn_handle_interrupt 8001
[   81.280345] link = 1, ret = 0829
clause 22 register @eth0 0x0000 <-- 0x1000[   81.284442] socfpga-dwmac ff700000.ethernet eth0: No phy led trigger registered for speed(10)

/root # [   81.297690] socfpga-dwmac ff700000.ethernet eth0: Link is Up - 10Mbps/Half - flow control off

I've also tried a completely different approach, that is "emulating"
autoneg by implementing the config_aneg, aneg_done and read_status
callbacks. If I do this, then my driver works just fine and nobody seems
to be overwriting register 0.

Any clue on what I might be doing wrong here?

Thank you in advance for any help you might provide.
Regards,
Piergiorgio


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help on PHY not supporting autoneg
  2022-11-30 12:16 Help on PHY not supporting autoneg Piergiorgio Beruto
@ 2022-11-30 15:32 ` Andrew Lunn
  2022-11-30 22:59   ` Piergiorgio Beruto
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Lunn @ 2022-11-30 15:32 UTC (permalink / raw)
  To: Piergiorgio Beruto; +Cc: netdev

On Wed, Nov 30, 2022 at 01:16:02PM +0100, Piergiorgio Beruto wrote:
> Hello netdev group,
> I am looking for someone willing to help me on a problem I encountered
> while trying to implement a driver for a PHY that does not support
> autoneg. On my reference HW, the PHY is connected to a stmmac via MII.
> 
> This is what I did in the driver:
> - implement get_features to report NO AN support, and the supported link
>   modes.
> - implement the IRQ handling (config_intr and handle_interrupt) to
>   report the link status
> 
> The first problem I encountered is: how to start the PHY? The device
> requires a bit (LINK_CONTROL) to be enabled before trying to bring up
> the link. But I could not find any obvious callback from phylib to
> actually do this. Eventually, I opted for implementing the
> suspend/resume callbacks and set that bit in there. Is that right? Any
> better option?

The MAC driver should call phy_start() in its open function.  That
kicks off the state machine. That results in phy_start_aneg(),
_phy_start_aneg(), phy_config_aneg(), which calls the PHY drivers
config_aneg() method. It does not matter here that the PHY does not
actually support AN, this method still gets called. The driver can
then look at the phydev members and configure the hardware as needed.
 
> With that said, the thing still does not work as I would expect. When I
> ifconfig up my device, here's what happens (the ncn_* prints are just
> debug prinktks I've added to show the problem). See also my comments in
> the log marked with //
> 
> /root # ifconfig eth0 up
> [   26.950557] socfpga-dwmac ff700000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
> [   26.962673] ncn_soft_reset
> [   26.966345] ncn_config_init
> [   26.969168] ncn_config_intr
> [   26.972211] disable IRQs   // OK, this is expected, phylib is resetting the device
> [   26.975062] ncn_resume     // not sure I expected this to be here, but it does not harm
> [   26.977746] socfpga-dwmac ff700000.ethernet eth0: PHY [stmmac-0:08] driver [NCN26000] (irq=49)
> [   26.986861] ncn_config_intr
> [   26.990045] ncn_config_intr ret = 8000, irqs = 0001
> [   26.994958] ncn_handle_interrupt 8000
> [   26.998941] ncn_handle_interrupt 8001
> [   27.002752] link = 1, ret = 0829       // there I get a link UP!
> [   27.016526] dwmac1000: Master AXI performs any burst length
> [   27.022128] socfpga-dwmac ff700000.ethernet eth0: No Safety Features support found
> [   27.029999] socfpga-dwmac ff700000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
> [   27.039425] socfpga-dwmac ff700000.ethernet eth0: registered PTP clock
> [   27.049388] socfpga-dwmac ff700000.ethernet eth0: configuring for phy/mii link mode
> [   27.057155] ncn_resume  // I don't fully understand what happened since the link up, but it seems the MAC is doing more initialization

This looks odd. I would not expect a resume here. Add a WARN_ON(1) to
get a stack trace and see if that helps explain why. My guess would
be, you somehow end up in socfpga_dwmac_resume().

> [   27.061251] ncn_handle_interrupt 8001
> [   27.065100] link = 0, ret = 0809 // here I get a link down (???)
> 
> >From there on, if I read the LINK_CONTROL bit, someone set it to 0 (???)

You can add a WARN_ON(regnum==0) in phy_write() to get a stack trace.

> I've also tried a completely different approach, that is "emulating"
> autoneg by implementing the config_aneg, aneg_done and read_status
> callbacks. If I do this, then my driver works just fine and nobody seems
> to be overwriting register 0.

That is not emulating. The config_aneg() just has a bad name.

If you have not provided a config_aneg() the default implementation is
used, __genphy_config_aneg(), which will call genphy_setup_forced():

int genphy_setup_forced(struct phy_device *phydev)
{
	u16 ctl;

	phydev->pause = 0;
	phydev->asym_pause = 0;

	ctl = mii_bmcr_encode_fixed(phydev->speed, phydev->duplex);

	return phy_modify(phydev, MII_BMCR,
			  ~(BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN), ctl);

What exactly is LINK_CONTROL. It is not one of the Linux names for a
bit in BMCR.

    Andrew

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help on PHY not supporting autoneg
  2022-11-30 15:32 ` Andrew Lunn
@ 2022-11-30 22:59   ` Piergiorgio Beruto
  2022-12-01  2:14     ` Andrew Lunn
  0 siblings, 1 reply; 5+ messages in thread
From: Piergiorgio Beruto @ 2022-11-30 22:59 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev

> > [   27.049388] socfpga-dwmac ff700000.ethernet eth0: configuring for phy/mii link mode
> > [   27.057155] ncn_resume  // I don't fully understand what happened since the link up, but it seems the MAC is doing more initialization
> 
> This looks odd. I would not expect a resume here. Add a WARN_ON(1) to
> get a stack trace and see if that helps explain why. My guess would
> be, you somehow end up in socfpga_dwmac_resume().
Ok, I can see the MAC driver gets there. I'll try to debug this later
(not my focus at the moment).

> > [   27.061251] ncn_handle_interrupt 8001
> > [   27.065100] link = 0, ret = 0809 // here I get a link down (???)
> > 
> > >From there on, if I read the LINK_CONTROL bit, someone set it to 0 (???)
> 
> You can add a WARN_ON(regnum==0) in phy_write() to get a stack trace.
Indeed, it was genphy_setup_forced setting this bit to 0.

> 
> > I've also tried a completely different approach, that is "emulating"
> > autoneg by implementing the config_aneg, aneg_done and read_status
> > callbacks. If I do this, then my driver works just fine and nobody seems
> > to be overwriting register 0.
> 
> That is not emulating. The config_aneg() just has a bad name.
> 
> If you have not provided a config_aneg() the default implementation is
> used, __genphy_config_aneg(), which will call genphy_setup_forced():
> 
> int genphy_setup_forced(struct phy_device *phydev)
> {
> 	u16 ctl;
> 
> 	phydev->pause = 0;
> 	phydev->asym_pause = 0;
> 
> 	ctl = mii_bmcr_encode_fixed(phydev->speed, phydev->duplex);
> 
> 	return phy_modify(phydev, MII_BMCR,
> 			  ~(BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN), ctl);
> 
Ok, there is where my problem was. If I supply a config_aneg()
implementation that sets the bits I want, then everything works just
fine, and ethtool seems to report the correct information.

[   17.061733] dwmac1000: Master AXI performs any burst length
[   17.061766] socfpga-dwmac ff700000.ethernet eth0: No Safety Features support found
[   17.061799] socfpga-dwmac ff700000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[   17.062957] socfpga-dwmac ff700000.ethernet eth0: registered PTP clock
[   17.066366] socfpga-dwmac ff700000.ethernet eth0: configuring for phy/mii link mode
[   17.067170] socfpga-dwmac ff700000.ethernet eth0: No phy led trigger registered for speed(10)
[   17.067404] socfpga-dwmac ff700000.ethernet eth0: Link is Up - 10Mbps/Half - flow control off
/root #
/root # ethtool eth0
Settings for eth0:
        Supported ports: [ MII ]
        Supported link modes:   10baseT1S_P2MP/Half
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT1S_P2MP/Half
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 10Mb/s
        Duplex: Half
        Auto-negotiation: off
        Port: Twisted Pair
        PHYAD: 8
        Transceiver: external
        MDI-X: off (auto)
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x0000003f (63)
                               drv probe link timer ifdown ifup
        Link detected: yes


> What exactly is LINK_CONTROL. It is not one of the Linux names for a
> bit in BMCR.
The 802.3cg standard define link_control as a varibale set by autoneg.
In factm it is tied to the BMCR_ANENABLE bit. The standard further
specifies that when AN is not supported, this bit can be supplied in a
vendor-specific way. A common thing to do is to just leave it tied to
the BMCR_ANENABLE bit.

So, the "problem" seems to lie in the genphy_setup_forced() function.
More precisely, where you pointed me at: 
>       return phy_modify(phydev, MII_BMCR,
>                         ~(BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN), ctl);
> 

In my view we have two choices to handle LINK_CONTROL.

1. Just let the PHY driver override the config_aneg() callback as I did.
This may be a bit counter-intuitive because of the function name, but it
works.

2. in phylib, distinguish the case of a PHY having aneg disabled from a
PHY NOT supporting aneg. In the latter case, don't touch the control
register and/or provide a separate callback for "starting" the PHY
(e.g., set_link_ctrl)

Thoughts?

Thank you again for your kind answer!
Piergiorgio

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help on PHY not supporting autoneg
  2022-11-30 22:59   ` Piergiorgio Beruto
@ 2022-12-01  2:14     ` Andrew Lunn
  2022-12-01  9:30       ` Piergiorgio Beruto
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Lunn @ 2022-12-01  2:14 UTC (permalink / raw)
  To: Piergiorgio Beruto; +Cc: netdev

> /root # ethtool eth0
> Settings for eth0:
>         Supported ports: [ MII ]
>         Supported link modes:   10baseT1S_P2MP/Half
>         Supported pause frame use: Symmetric Receive-only
>         Supports auto-negotiation: No
>         Supported FEC modes: Not reported
>         Advertised link modes:  10baseT1S_P2MP/Half
>         Advertised pause frame use: Symmetric Receive-only

That looks odd. The PHY should indicate if it supports pause
negotiation. Since this PHY does not support autoneg, it should not be
saying it can negotiate pause. So i'm wondering why it is saying this
here. Same for 'Supported pause'.

>         Advertised auto-negotiation: No
>         Advertised FEC modes: Not reported
>         Speed: 10Mb/s
>         Duplex: Half
>         Auto-negotiation: off
>         Port: Twisted Pair
>         PHYAD: 8
>         Transceiver: external
>         MDI-X: off (auto)

Given that this is a T1 PHY does MDI-X have any meaning? The (auto)
indicates the PHY is returning mdix_ctrl=ETH_TP_MDI_AUTO, when it
should be returning ETH_TP_MDI_INVALID to indicate it is not
supported.

>         Supports Wake-on: d
>         Wake-on: d
>         Current message level: 0x0000003f (63)
>                                drv probe link timer ifdown ifup
>         Link detected: yes
> 
> 
> > What exactly is LINK_CONTROL. It is not one of the Linux names for a
> > bit in BMCR.
> The 802.3cg standard define link_control as a varibale set by autoneg.
> In factm it is tied to the BMCR_ANENABLE bit. The standard further
> specifies that when AN is not supported, this bit can be supplied in a
> vendor-specific way. A common thing to do is to just leave it tied to
> the BMCR_ANENABLE bit.
> 
> So, the "problem" seems to lie in the genphy_setup_forced() function.
> More precisely, where you pointed me at: 
> >       return phy_modify(phydev, MII_BMCR,
> >                         ~(BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN), ctl);
> > 
> 
> In my view we have two choices to handle LINK_CONTROL.
> 
> 1. Just let the PHY driver override the config_aneg() callback as I did.
> This may be a bit counter-intuitive because of the function name, but it
> works.
> 
> 2. in phylib, distinguish the case of a PHY having aneg disabled from a
> PHY NOT supporting aneg. In the latter case, don't touch the control
> register and/or provide a separate callback for "starting" the PHY
> (e.g., set_link_ctrl)

2) sounds wrong. As you said, it is vendor-specific. As such you
cannot trust it to mean anything. The standard has left it undefined,
so you cannot do anything in generic code. I would also worry about
breaking older PHYs who have never had this bit set.

So i would go with 1). As i said, the function name is not ideal, but
it has been like this since forever.

   Andrew

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help on PHY not supporting autoneg
  2022-12-01  2:14     ` Andrew Lunn
@ 2022-12-01  9:30       ` Piergiorgio Beruto
  0 siblings, 0 replies; 5+ messages in thread
From: Piergiorgio Beruto @ 2022-12-01  9:30 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev

On Thu, Dec 01, 2022 at 03:14:51AM +0100, Andrew Lunn wrote:
> > /root # ethtool eth0
> > Settings for eth0:
> >         Supported ports: [ MII ]
> >         Supported link modes:   10baseT1S_P2MP/Half
> >         Supported pause frame use: Symmetric Receive-only
> >         Supports auto-negotiation: No
> >         Supported FEC modes: Not reported
> >         Advertised link modes:  10baseT1S_P2MP/Half
> >         Advertised pause frame use: Symmetric Receive-only
> 
> That looks odd. The PHY should indicate if it supports pause
> negotiation. Since this PHY does not support autoneg, it should not be
> saying it can negotiate pause. So i'm wondering why it is saying this
> here. Same for 'Supported pause'.
This is indeed a good question. This is the code snippet for the PHY
driver:

static int ncn26000_get_features(struct phy_device *phydev) {
	linkmode_zero(phydev->supported);
	linkmode_set_bit(ETHTOOL_LINK_MODE_MII_BIT, phydev->supported);

	linkmode_set_bit(
		ETHTOOL_LINK_MODE_10baseT1S_P2MP_Half_BIT,
		phydev->supported
	);

	linkmode_copy(phydev->advertising, phydev->supported);
	return 0;
}

static int ncn26000_config_aneg(struct phy_device *phydev) {
	phydev->mdix_ctrl = ETH_TP_MDI_AUTO;
	phydev->mdix = ETH_TP_MDI;
	phydev->pause = 0;
	phydev->asym_pause = 0;
	phydev->speed = SPEED_10;
	phydev->duplex = DUPLEX_HALF;

	// bring up the link (link_ctrl is mapped to BMCR_ANENABLE)
	// clear also ISOLATE mode and Collision Test
	return phy_write(phydev, MII_BMCR, BMCR_ANENABLE);
}

As you can see there is no mention of PAUSE support in the features, and
the pause/asym_pause flags in phydev are set to 0.

Could it be a problem of the userspace tool?
Any advice on where to start looking?
 
> >         Advertised auto-negotiation: No
> >         Advertised FEC modes: Not reported
> >         Speed: 10Mb/s
> >         Duplex: Half
> >         Auto-negotiation: off
> >         Port: Twisted Pair
> >         PHYAD: 8
> >         Transceiver: external
> >         MDI-X: off (auto)
> 
> Given that this is a T1 PHY does MDI-X have any meaning? The (auto)
> indicates the PHY is returning mdix_ctrl=ETH_TP_MDI_AUTO, when it
> should be returning ETH_TP_MDI_INVALID to indicate it is not
> supported.
This is in fact debatable. The 10BASE-T1S has a unique feature making it
polarity insensitive. It's not like other xBASE-T1 PHYs that
auto-detects the polarity, the line coding (Differential Manchester) is
intrinsically polarity agnostic. Therefore I'm personally undecided on
the best approach.

On one hand, If the driver reports that MDI-X is not supported, the user
might think he needs to care about the polarity, which could be
misleading. On the other hand, there is no real auto-switch of TX/RX.

I'm not sure if/how 100BASE-T1 for example handles polarity, but we
probably need a common approach.

Thoughts?
 
> >         Supports Wake-on: d
> >         Wake-on: d
> >         Current message level: 0x0000003f (63)
> >                                drv probe link timer ifdown ifup
> >         Link detected: yes
> > 
> > 
> > > What exactly is LINK_CONTROL. It is not one of the Linux names for a
> > > bit in BMCR.
> > The 802.3cg standard define link_control as a varibale set by autoneg.
> > In factm it is tied to the BMCR_ANENABLE bit. The standard further
> > specifies that when AN is not supported, this bit can be supplied in a
> > vendor-specific way. A common thing to do is to just leave it tied to
> > the BMCR_ANENABLE bit.
> > 
> > So, the "problem" seems to lie in the genphy_setup_forced() function.
> > More precisely, where you pointed me at: 
> > >       return phy_modify(phydev, MII_BMCR,
> > >                         ~(BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN), ctl);
> > > 
> > 
> > In my view we have two choices to handle LINK_CONTROL.
> > 
> > 1. Just let the PHY driver override the config_aneg() callback as I did.
> > This may be a bit counter-intuitive because of the function name, but it
> > works.
> > 
> > 2. in phylib, distinguish the case of a PHY having aneg disabled from a
> > PHY NOT supporting aneg. In the latter case, don't touch the control
> > register and/or provide a separate callback for "starting" the PHY
> > (e.g., set_link_ctrl)
> 
> 2) sounds wrong. As you said, it is vendor-specific. As such you
> cannot trust it to mean anything. The standard has left it undefined,
> so you cannot do anything in generic code. I would also worry about
> breaking older PHYs who have never had this bit set.
Maybe I did not explain what I had in mind correctly.
I did not mean to set this bit, it is indeed vendor-specific.
The idea would be to define a new callback that by default does what we
have right now (clear all bits in register 0). But PHY drivers may
override it to actually do something meaningful, like setting the AN
bit.
> 
> So i would go with 1). As i said, the function name is not ideal, but
> it has been like this since forever.
I'm not strongly biased between 1 and 2, but I would first re-consider 2
before making a final decision.

Thank you very much for your kind reply.
Piergiorgio

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-12-01  9:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-30 12:16 Help on PHY not supporting autoneg Piergiorgio Beruto
2022-11-30 15:32 ` Andrew Lunn
2022-11-30 22:59   ` Piergiorgio Beruto
2022-12-01  2:14     ` Andrew Lunn
2022-12-01  9:30       ` Piergiorgio Beruto

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.