All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
@ 2016-10-27 22:05 Timur Tabi
  2016-10-27 22:12 ` Florian Fainelli
  2016-10-31 17:48 ` David Miller
  0 siblings, 2 replies; 10+ messages in thread
From: Timur Tabi @ 2016-10-27 22:05 UTC (permalink / raw)
  To: netdev, zefir.kurtisi, scampbel, alokc, shankerd, andrew, f.fainelli

The Atheros 8031 PHY supports the 802.3 extension for symmetric and
asymmetric pause frames, so set that to the list of features supported
by the phy.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
---

Without this patch, my NIC (the Qualcomm EMAC) receives a lot of frame
check sequence (aka CRC) errors, resulting in about 10% packet loss.
Can someone help me understand why?  Because of this patch, I can't use
the generic phy driver in phylib.  Why would a MAC controller require
its PHY to support pause frames?

 drivers/net/phy/at803x.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index a52b560..fb80413 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -440,7 +440,8 @@ static struct phy_driver at803x_driver[] = {
 	.get_wol		= at803x_get_wol,
 	.suspend		= at803x_suspend,
 	.resume			= at803x_resume,
-	.features		= PHY_GBIT_FEATURES,
+	.features		= PHY_GBIT_FEATURES |
+				  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
 	.flags			= PHY_HAS_INTERRUPT,
 	.config_aneg		= genphy_config_aneg,
 	.read_status		= genphy_read_status,
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-27 22:05 [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames Timur Tabi
@ 2016-10-27 22:12 ` Florian Fainelli
  2016-10-27 22:24   ` Timur Tabi
  2016-10-31 17:48 ` David Miller
  1 sibling, 1 reply; 10+ messages in thread
From: Florian Fainelli @ 2016-10-27 22:12 UTC (permalink / raw)
  To: Timur Tabi, netdev, zefir.kurtisi, scampbel, alokc, shankerd, andrew

On 10/27/2016 03:05 PM, Timur Tabi wrote:
> The Atheros 8031 PHY supports the 802.3 extension for symmetric and
> asymmetric pause frames, so set that to the list of features supported
> by the phy.
> 
> Signed-off-by: Timur Tabi <timur@codeaurora.org>
> ---
> 
> Without this patch, my NIC (the Qualcomm EMAC) receives a lot of frame
> check sequence (aka CRC) errors, resulting in about 10% packet loss.

Hu? In my experience that should not come from supporting Pause frames
or not, but rather properly configuring a (RG)MII delay, but your
mileage may vary.

> Can someone help me understand why?  Because of this patch, I can't use
> the generic phy driver in phylib.  Why would a MAC controller require
> its PHY to support pause frames?

It does not, support for Pause frames is a MAC-level feature, yet,
PHYLIB (and that's been on my todo for a while now) insists on reporting
the confusing phydev->pause and phydev->asym_pause, which really is what
has been determined from auto-negotiating with your partner, as opposed
to being a supported thing or not. The PHY has absolutely not business
in that.

Your change should probably be in the Ethernet MAC driver, when you have
successfully connected to the PHY, update phydev->supported to include
SUPPORTED_Pause | SUPPORTED_Asym_pause.

> 
>  drivers/net/phy/at803x.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> index a52b560..fb80413 100644
> --- a/drivers/net/phy/at803x.c
> +++ b/drivers/net/phy/at803x.c
> @@ -440,7 +440,8 @@ static struct phy_driver at803x_driver[] = {
>  	.get_wol		= at803x_get_wol,
>  	.suspend		= at803x_suspend,
>  	.resume			= at803x_resume,
> -	.features		= PHY_GBIT_FEATURES,
> +	.features		= PHY_GBIT_FEATURES |
> +				  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
>  	.flags			= PHY_HAS_INTERRUPT,
>  	.config_aneg		= genphy_config_aneg,
>  	.read_status		= genphy_read_status,
> 


-- 
Florian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-27 22:12 ` Florian Fainelli
@ 2016-10-27 22:24   ` Timur Tabi
  2016-10-27 22:39     ` Florian Fainelli
  0 siblings, 1 reply; 10+ messages in thread
From: Timur Tabi @ 2016-10-27 22:24 UTC (permalink / raw)
  To: Florian Fainelli, netdev, zefir.kurtisi, scampbel, alokc,
	shankerd, andrew

Florian Fainelli wrote:

> Hu? In my experience that should not come from supporting Pause frames
> or not, but rather properly configuring a (RG)MII delay, but your
> mileage may vary.

I can assure you, I'm more confused than you.  I've been working in this 
for almost two weeks, and not only does this patch align with other phy 
drivers, but it does fix my bug.  Without this patch, phylib does not 
set the pause frame bits (10 and 11) in MII_ADVERTISE.

> It does not, support for Pause frames is a MAC-level feature, yet,
> PHYLIB (and that's been on my todo for a while now) insists on reporting
> the confusing phydev->pause and phydev->asym_pause, which really is what
> has been determined from auto-negotiating with your partner, as opposed
> to being a supported thing or not. The PHY has absolutely not business
> in that.

But there are pause frame bits in the MII_ADVERTISE register, and this 
setting directly impacts whether those bits are set.  I don't see how 
this is a MAC-level feature.

> Your change should probably be in the Ethernet MAC driver, when you have
> successfully connected to the PHY, update phydev->supported to include
> SUPPORTED_Pause | SUPPORTED_Asym_pause.

The phylib documentation (networking/phy.txt) says:

  Now just make sure that phydev->supported and phydev->advertising have any
  values pruned from them which don't make sense for your controller (a 
10/100
  controller may be connected to a gigabit capable PHY, so you would need to
  mask off SUPPORTED_1000baseT*).  See include/linux/ethtool.h for 
definitions
  for these bitfields.

and then it says:

> Note that you should not SET any bits, or the PHY may
>  get put into an unsupported state.

So are you sure I'm supposed to set those bits?

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-27 22:24   ` Timur Tabi
@ 2016-10-27 22:39     ` Florian Fainelli
  2016-10-28 20:06       ` Timur Tabi
  0 siblings, 1 reply; 10+ messages in thread
From: Florian Fainelli @ 2016-10-27 22:39 UTC (permalink / raw)
  To: Timur Tabi, netdev, zefir.kurtisi, scampbel, alokc, shankerd, andrew

On 10/27/2016 03:24 PM, Timur Tabi wrote:
> Florian Fainelli wrote:
> 
>> Hu? In my experience that should not come from supporting Pause frames
>> or not, but rather properly configuring a (RG)MII delay, but your
>> mileage may vary.
> 
> I can assure you, I'm more confused than you.  I've been working in this
> for almost two weeks, and not only does this patch align with other phy
> drivers, but it does fix my bug.  Without this patch, phylib does not
> set the pause frame bits (10 and 11) in MII_ADVERTISE.

And that's expected, but if your MAC does not act upon phydev->pause and
phydev->asym_pause, then chances are that you can indeed lose packets
every once in a while.

The part that is totally crappy about Pause frames and PHYLIB is that we
need some information about whether this should be supported or not,
such that we can change MII_ADVERTISE accordingly to reflect that, and
what best way to do this than use one of these SUPPORTED_* bits to set
that, except, that unlike speed (which is both a MAC and PHY property),
Pause is exclusively MAC, yet, it transpires in PHYLIB.

MACs that I work with for instance need to be told to ignore pause
frames, or not ignore them, it all depends on what you want to
advertise/support.

> 
>> It does not, support for Pause frames is a MAC-level feature, yet,
>> PHYLIB (and that's been on my todo for a while now) insists on reporting
>> the confusing phydev->pause and phydev->asym_pause, which really is what
>> has been determined from auto-negotiating with your partner, as opposed
>> to being a supported thing or not. The PHY has absolutely not business
>> in that.
> 
> But there are pause frame bits in the MII_ADVERTISE register, and this
> setting directly impacts whether those bits are set.  I don't see how
> this is a MAC-level feature.

This is a MAC level feature, that needs to be auto-negotiated with your
link partner, which is why there is room for this in MII_ADVERTISE, but
the PHY absolutely does not participate in Pause frames, other than
passing them through the MAC, like normal frames. Whether your MAC acts
upon that or not is a MAC dependent feature.

> 
>> Your change should probably be in the Ethernet MAC driver, when you have
>> successfully connected to the PHY, update phydev->supported to include
>> SUPPORTED_Pause | SUPPORTED_Asym_pause.
> 
> The phylib documentation (networking/phy.txt) says:
> 
>  Now just make sure that phydev->supported and phydev->advertising have any
>  values pruned from them which don't make sense for your controller (a
> 10/100
>  controller may be connected to a gigabit capable PHY, so you would need to
>  mask off SUPPORTED_1000baseT*).  See include/linux/ethtool.h for
> definitions
>  for these bitfields.
> 
> and then it says:
> 
>> Note that you should not SET any bits, or the PHY may
>>  get put into an unsupported state.
> 
> So are you sure I'm supposed to set those bits?

Yes I am sure you should set these bits, the documentation is not
necessarily the authoritative source here (although it should). What
this probably meant to be written is something like: don't set any bits
that are not defined in ethtool.h, i.e: do not use this to store
persistent states.
-- 
Florian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-27 22:39     ` Florian Fainelli
@ 2016-10-28 20:06       ` Timur Tabi
  2016-10-28 21:03         ` Florian Fainelli
  0 siblings, 1 reply; 10+ messages in thread
From: Timur Tabi @ 2016-10-28 20:06 UTC (permalink / raw)
  To: Florian Fainelli, netdev, zefir.kurtisi, scampbel, alokc,
	shankerd, andrew

Florian Fainelli wrote:
> On 10/27/2016 03:24 PM, Timur Tabi wrote:
>> Florian Fainelli wrote:
>>
>>> Hu? In my experience that should not come from supporting Pause frames
>>> or not, but rather properly configuring a (RG)MII delay, but your
>>> mileage may vary.
>>
>> I can assure you, I'm more confused than you.  I've been working in this
>> for almost two weeks, and not only does this patch align with other phy
>> drivers, but it does fix my bug.  Without this patch, phylib does not
>> set the pause frame bits (10 and 11) in MII_ADVERTISE.
>
> And that's expected, but if your MAC does not act upon phydev->pause and
> phydev->asym_pause, then chances are that you can indeed lose packets
> every once in a while.

Can you give me more details on that?  I'm not really an expert on our 
MAC, so I'm not sure how it's supposed to work.  The MAC supports the 
concept of "flow control".  The non-upstream version of my driver reads 
PHY register 0x11, which apparently is a PHY-specific status register 
that says whether or not "transmit pause" and "receive pause" is 
enabled.  (The AT8031 datasheet is here, if you'd like to read it: 
http://www.datasheet-pdf.com/datasheet/AtherosCommunications/734720/AR8031.pdf.html).

If both are enabled in the PHY, then the driver enables the same 
features in the MAC.  Unfortunately, I don't have any documentation that 
explains that that really means.  I've tried enabling and disabling this 
feature in the MAC, and it makes no difference.  You would think that if 
the feature is disabled in both the MAC and the PHY, then no packets 
would be lost, but that's not the case.

> The part that is totally crappy about Pause frames and PHYLIB is that we
> need some information about whether this should be supported or not,
> such that we can change MII_ADVERTISE accordingly to reflect that, and
> what best way to do this than use one of these SUPPORTED_* bits to set
> that, except, that unlike speed (which is both a MAC and PHY property),
> Pause is exclusively MAC, yet, it transpires in PHYLIB.

Then what do those bits in register 0x11 do?  If I don't set them, I 
lose packets, no matter how my MAC is programmed.

It's like that joke, "Doctor, if I hold my arm like this, then it 
hurts!", "Well, then don't hold your arm like that."  If I don't program 
those PHY register bits, then my hardware doesn't work.

> MACs that I work with for instance need to be told to ignore pause
> frames, or not ignore them, it all depends on what you want to
> advertise/support.

That's the problem, I don't know what I "want", except that I want my 
hardware to work. :-)  99% of my knowledge of the hardware comes from 
scanning the source code of the internal version of the driver that 1) 
does not support phylib,  and 2) is hard-coded to initialize the Atheros 
8031 PHY.

>>> It does not, support for Pause frames is a MAC-level feature, yet,
>>> PHYLIB (and that's been on my todo for a while now) insists on reporting
>>> the confusing phydev->pause and phydev->asym_pause, which really is what
>>> has been determined from auto-negotiating with your partner, as opposed
>>> to being a supported thing or not. The PHY has absolutely not business
>>> in that.
>>
>> But there are pause frame bits in the MII_ADVERTISE register, and this
>> setting directly impacts whether those bits are set.  I don't see how
>> this is a MAC-level feature.
>
> This is a MAC level feature, that needs to be auto-negotiated with your
> link partner, which is why there is room for this in MII_ADVERTISE, but
> the PHY absolutely does not participate in Pause frames, other than
> passing them through the MAC, like normal frames. Whether your MAC acts
> upon that or not is a MAC dependent feature.

Ok, to me that means that the PHY driver must tell phylib whether or not 
it supports pause frames.  The question becomes:

1) Is my patch correct?
2) Is my patch necessary?
3) Is my patch sufficient?

1) I believe my patch is correct, because many other PHY drivers do the 
same thing for the same reason.  If the PHY supports register 4 bits 10 
and 11, then those SUPPORTED_xxx bits should be set.

2) I don't know why my patch appears to be necessary, because I don't 
understand why a 1Gb NIC on a 2Ghz processor drops frames.  I suspect 
the program is not performance but rather a mis-programming.

4) Is my patch sufficient?  The internal driver always enables pause 
frame support in the PHY if the PHY says that pause frames are enabled. 
  In fact, I can't even turn off flow control in the internal driver:

$ sudo ethtool -A eth1 tx off rx off
$ sudo ethtool -a eth1
Pause parameters for eth1:
Autonegotiate:  on
RX:             on
TX:             on

The driver insists on leaving flow control enabled if autonegotiation is 
enabled.

> Yes I am sure you should set these bits, the documentation is not
> necessarily the authoritative source here (although it should). What
> this probably meant to be written is something like: don't set any bits
> that are not defined in ethtool.h, i.e: do not use this to store
> persistent states.

So when I do that, it works.  I force those bits on in my driver, and 
pause frames are enabled and everything works correctly.

I still would like to know what those bits in register 4 really are for. 
  My understanding is that pause frames are an extension to the 802.3 
standard, and not every PHY supports them, so not even PHY supports 
setting bits 10 and 11 in register 4.  That's why the PHY driver should 
be setting them, not the MAC.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-28 20:06       ` Timur Tabi
@ 2016-10-28 21:03         ` Florian Fainelli
  2016-10-31 17:23           ` Timur Tabi
  0 siblings, 1 reply; 10+ messages in thread
From: Florian Fainelli @ 2016-10-28 21:03 UTC (permalink / raw)
  To: Timur Tabi, netdev, zefir.kurtisi, scampbel, alokc, shankerd, andrew

On 10/28/2016 01:06 PM, Timur Tabi wrote:
> Florian Fainelli wrote:
>> On 10/27/2016 03:24 PM, Timur Tabi wrote:
>>> Florian Fainelli wrote:
>>>
>>>> Hu? In my experience that should not come from supporting Pause frames
>>>> or not, but rather properly configuring a (RG)MII delay, but your
>>>> mileage may vary.
>>>
>>> I can assure you, I'm more confused than you.  I've been working in this
>>> for almost two weeks, and not only does this patch align with other phy
>>> drivers, but it does fix my bug.  Without this patch, phylib does not
>>> set the pause frame bits (10 and 11) in MII_ADVERTISE.
>>
>> And that's expected, but if your MAC does not act upon phydev->pause and
>> phydev->asym_pause, then chances are that you can indeed lose packets
>> every once in a while.
> 
> Can you give me more details on that?  I'm not really an expert on our
> MAC, so I'm not sure how it's supposed to work.  The MAC supports the
> concept of "flow control".  The non-upstream version of my driver reads
> PHY register 0x11, which apparently is a PHY-specific status register
> that says whether or not "transmit pause" and "receive pause" is
> enabled.  (The AT8031 datasheet is here, if you'd like to read it:
> http://www.datasheet-pdf.com/datasheet/AtherosCommunications/734720/AR8031.pdf.html).

How it works is that you read phydev->pause and phydev->asym_pause to
know what your link partner advertises in terms of flow control/pause
frame (they are the same thing here). Based on that, and your local
settings (from ethool -A) you advertise/enable flow control as well or
don't do it, and the result is either both agree on what is being used,
flow control is enabled, end-to-end, or it is not. It's exactly the same
thing as speed, except that there is a possibly to do asymetric things,
where flow control is only enabled in one direction or the other.

The way this works is this:

You transmit data, faster than your link partner can process it (busy
receiving, loaded, you name it), it sends back Pause frames back at you
(see below for how this is typically implemented), your MAC gets these
Pause frames back, which feed into the transmit logic, your TX
completion interrupt gets signaled slower than the actual data rate you
are pushing to your link partner, you call dev_kfree_skb() a little
slower from your TX completion handler, the networking stack knows this
means to back off, your transmission rate adjusts to your link partner's
processing capability, no packets get lost.

Conversely, if you were receiving a stream of packets that you have a
hard time keeping up with, the part of your Ethernet MAC that fills
system memory with packets should realize that it is producing them
faster than you are consuming them, this triggers the flow control
mechanism of your MAC receiver, and it sends Pause frames back at the
link partner.

> 
> 
> If both are enabled in the PHY, then the driver enables the same
> features in the MAC.  Unfortunately, I don't have any documentation that
> explains that that really means.  I've tried enabling and disabling this
> feature in the MAC, and it makes no difference.  You would think that if
> the feature is disabled in both the MAC and the PHY, then no packets
> would be lost, but that's not the case.

The PHY does not participate in flow control, other than passing frames
through (and your Atheros PHY may have a register to control that
passthrough, but that would be surprising, could not find one).

> 
>> The part that is totally crappy about Pause frames and PHYLIB is that we
>> need some information about whether this should be supported or not,
>> such that we can change MII_ADVERTISE accordingly to reflect that, and
>> what best way to do this than use one of these SUPPORTED_* bits to set
>> that, except, that unlike speed (which is both a MAC and PHY property),
>> Pause is exclusively MAC, yet, it transpires in PHYLIB.
> 
> Then what do those bits in register 0x11 do?  If I don't set them, I
> lose packets, no matter how my MAC is programmed.

This is just a status register, as you can see, all bits are read-only,
it just aggregates the resolution of what is advertised and what is
negotiated, which is not without value, but PHYLIB does the same
intersection here.

> 
> It's like that joke, "Doctor, if I hold my arm like this, then it
> hurts!", "Well, then don't hold your arm like that."  If I don't program
> those PHY register bits, then my hardware doesn't work.
> 
>> MACs that I work with for instance need to be told to ignore pause
>> frames, or not ignore them, it all depends on what you want to
>> advertise/support.
> 
> That's the problem, I don't know what I "want", except that I want my
> hardware to work. :-)  99% of my knowledge of the hardware comes from
> scanning the source code of the internal version of the driver that 1)
> does not support phylib,  and 2) is hard-coded to initialize the Atheros
> 8031 PHY.

May I suggest reading about standards a bit more, or just looking at
other drivers, like tg3.c.

> 
>>>> It does not, support for Pause frames is a MAC-level feature, yet,
>>>> PHYLIB (and that's been on my todo for a while now) insists on
>>>> reporting
>>>> the confusing phydev->pause and phydev->asym_pause, which really is
>>>> what
>>>> has been determined from auto-negotiating with your partner, as opposed
>>>> to being a supported thing or not. The PHY has absolutely not business
>>>> in that.
>>>
>>> But there are pause frame bits in the MII_ADVERTISE register, and this
>>> setting directly impacts whether those bits are set.  I don't see how
>>> this is a MAC-level feature.
>>
>> This is a MAC level feature, that needs to be auto-negotiated with your
>> link partner, which is why there is room for this in MII_ADVERTISE, but
>> the PHY absolutely does not participate in Pause frames, other than
>> passing them through the MAC, like normal frames. Whether your MAC acts
>> upon that or not is a MAC dependent feature.
> 
> Ok, to me that means that the PHY driver must tell phylib whether or not
> it supports pause frames.  The question becomes:
> 
> 1) Is my patch correct?
> 2) Is my patch necessary?
> 3) Is my patch sufficient?

Your patch is correct but also insufficient, although, as indicated
before, there is an misconception with PHY drivers that they should be
setting SUPPORTED_*Pause* bits, the MAC should do that, based on what it
does, anyway, but more importantly you need to have your Ethernet MAC
react properly upon what the auto-negotiation and locally configured
pause/flow control settings are, and configure the Ethernet MAC accordingly.

When you want pause frames to be enabled, you need to advertise them,
and in order to do so, you need to set phydev->supported to have
SUPPORTED_Pause and SUPPORTED_AsymPause, and call genphy_restart_aneg()
to transfer that to a write to the MII_ADVERTISE register, resulting in
your link partner to see that you now support Pause frames. Since pause
frames are now in use, you want your Ethernet MAC, not to ignore pause
frames (have flow control enabled).

> 
> 1) I believe my patch is correct, because many other PHY drivers do the
> same thing for the same reason.  If the PHY supports register 4 bits 10
> and 11, then those SUPPORTED_xxx bits should be set.
> 
> 2) I don't know why my patch appears to be necessary, because I don't
> understand why a 1Gb NIC on a 2Ghz processor drops frames.  I suspect
> the program is not performance but rather a mis-programming.
> 
> 4) Is my patch sufficient?  The internal driver always enables pause
> frame support in the PHY if the PHY says that pause frames are enabled.
>  In fact, I can't even turn off flow control in the internal driver:

Please stop abusing this "PHY says", the PHY advertises what it is being
told to advertise, by the MAC...

> 
> $ sudo ethtool -A eth1 tx off rx off
> $ sudo ethtool -a eth1
> Pause parameters for eth1:
> Autonegotiate:  on
> RX:             on
> TX:             on
> 
> The driver insists on leaving flow control enabled if autonegotiation is
> enabled.

Actually, the driver forces the enabling of flow control in the MAC,
period, but does not do anything with the auto-negotiation results, or I
could not spot it.

This was spotted in the review, but not addressed, but your adjust_link
callback seems completely bogus, it does a full MAC stop,
reconfiguration and restart, that is at best unnecessary and costly, but
it also misses a few things and does not act upon reading phydev->pause
and phydev->asym_pause to set or clear TXFC and RXFC, that really seems
to be your problem here.

Here is what could possibly go wrong right now:

- your Ethernet MAC unconditionally enables flow control at the MAC
level, but does not advertise support for that in MII_ADVERTISE until
you set SUPPORTED_Pause and SUPPORTED_Asym_Pause in the PHY driver, fair
enough

- the link partner you are testing against does not keep up well with
your transmit rates, but supports flow control, so Pause frames that it
sends back at you to tell you to stop transmitting so quickly just get
ignored, because not being advertised, you experience packet loss
-- 
Florian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-28 21:03         ` Florian Fainelli
@ 2016-10-31 17:23           ` Timur Tabi
  2016-10-31 17:55             ` Florian Fainelli
  0 siblings, 1 reply; 10+ messages in thread
From: Timur Tabi @ 2016-10-31 17:23 UTC (permalink / raw)
  To: Florian Fainelli, netdev, zefir.kurtisi, scampbel, alokc,
	shankerd, andrew

Florian Fainelli wrote:

> May I suggest reading about standards a bit more, or just looking at
> other drivers, like tg3.c.

I have been doing that for over six months now.  There's only so much I 
can glean from reading source code and standards documents.  The inner 
workings of our NIC are a mystery lost to time.

>> Ok, to me that means that the PHY driver must tell phylib whether or not
>> it supports pause frames.  The question becomes:
>>
>> 1) Is my patch correct?
>> 2) Is my patch necessary?
>> 3) Is my patch sufficient?
>
> Your patch is correct but also insufficient, although, as indicated
> before, there is an misconception with PHY drivers that they should be
> setting SUPPORTED_*Pause* bits,

If that's a misconception, then why is my patch correct?  It modifies 
the PHY driver to set those bits.  These drivers do the same thing:

bcm63xx.c
bcm7xxx.c
bcm-cygnus.c
broadcom.c
icplus.c
micrel.c
microchip.c
national.c
smsc.c
ste10Xp.c

 >  the MAC should do that, based on what it
> does, anyway, but more importantly you need to have your Ethernet MAC
> react properly upon what the auto-negotiation and locally configured
> pause/flow control settings are, and configure the Ethernet MAC accordingly.

Ok, so I should enable support for pause frames in my MAC if 
phydev->pause and/or phydev->asym_pause is set, and disable it otherwise?

If I read the spec correct (tx=MAC sends pause frames, rx=MAC 
understands received pause frames),

pause    asym_pause    enable tx?    enable rx?
-----    ----------    ----------    ----------
   0           0           No             No
   0           1           Yes            No
   1           0           Yes            Yes
   1           1           No             Yes

The last two seem backwards.  The internal driver enables RX and TX if 
pause and asym_pause is enabled.

> When you want pause frames to be enabled, you need to advertise them,
> and in order to do so, you need to set phydev->supported to have
> SUPPORTED_Pause and SUPPORTED_AsymPause, and call genphy_restart_aneg()
> to transfer that to a write to the MII_ADVERTISE register, resulting in
> your link partner to see that you now support Pause frames. Since pause
> frames are now in use, you want your Ethernet MAC, not to ignore pause
> frames (have flow control enabled).

Is there a way to enable pause frames without ethtool?  I intend to add 
ethtool support to my driver, but I want to fix this last bug first.

>> 1) I believe my patch is correct, because many other PHY drivers do the
>> same thing for the same reason.  If the PHY supports register 4 bits 10
>> and 11, then those SUPPORTED_xxx bits should be set.
>>
>> 2) I don't know why my patch appears to be necessary, because I don't
>> understand why a 1Gb NIC on a 2Ghz processor drops frames.  I suspect
>> the program is not performance but rather a mis-programming.
>>
>> 4) Is my patch sufficient?  The internal driver always enables pause
>> frame support in the PHY if the PHY says that pause frames are enabled.
>>   In fact, I can't even turn off flow control in the internal driver:
>
> Please stop abusing this "PHY says", the PHY advertises what it is being
> told to advertise, by the MAC...

Then I need help understanding why there are 10 other PHY drivers that 
set the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits in their .features 
flags.

>> $ sudo ethtool -A eth1 tx off rx off
>> $ sudo ethtool -a eth1
>> Pause parameters for eth1:
>> Autonegotiate:  on
>> RX:             on
>> TX:             on
>>
>> The driver insists on leaving flow control enabled if autonegotiation is
>> enabled.
>
> Actually, the driver forces the enabling of flow control in the MAC,
> period, but does not do anything with the auto-negotiation results, or I
> could not spot it.

Sorry, the internal driver is very different that the external one.  It 
lacks phylib support.  You can see the internal driver here:

https://source.codeaurora.org/external/thundersoft/kernel/msm-3.10/tree/drivers/net/ethernet/msm/emac?h=caf/LA.BF.2.1.2_rb1.2

Flow control is enabled or disabled via the TXFC and RXFC bits in 
function emac_hw_config_fc():

https://source.codeaurora.org/external/thundersoft/kernel/msm-3.10/tree/drivers/net/ethernet/msm/emac/emac_hw.c?h=caf/LA.BF.2.1.2_rb1.2#n1306

> This was spotted in the review, but not addressed, but your adjust_link
> callback seems completely bogus, it does a full MAC stop,
> reconfiguration and restart,

I've been struggling to figure out what exactly the driver should do 
during adjust_link, although I don't see where it's doing a full 
restart.  If the link is up, it just calls emac_mac_start(), which just 
starts the MAC.

 > that is at best unnecessary and costly, but
> it also misses a few things and does not act upon reading phydev->pause
> and phydev->asym_pause to set or clear TXFC and RXFC, that really seems
> to be your problem here.

Now that I understand (barely) what phydev->pause and phydev->asym_pause 
do, I can modify emac_mac_start() to use those values to set TXFC and RXFC.

Unfortunately, that won't explain why my driver needs the PHY to pass 
those frames to avoid 10% packet loss.  You would think a 1Gb NIC on a 
24-core 2Ghz SOC could handle 900 Mbps.

Note that if I throttle my bandwidth to 90 Mbps, I still get packet 
loss.  However, if I switch the link from gigabit to 100Mbps, then I 
don't get packet loss.  So it's not the actual throughput that's the 
problem.  I get CRC errors in gigabit mode no matter what, if the pause 
frames are blocked by the PHY.

> Here is what could possibly go wrong right now:
>
> - your Ethernet MAC unconditionally enables flow control at the MAC
> level, but does not advertise support for that in MII_ADVERTISE until
> you set SUPPORTED_Pause and SUPPORTED_Asym_Pause in the PHY driver, fair
> enough
>
> - the link partner you are testing against does not keep up well with
> your transmit rates, but supports flow control, so Pause frames that it
> sends back at you to tell you to stop transmitting so quickly just get
> ignored, because not being advertised, you experience packet loss

The link partner is an e1000 NIC on an 8-core 3.6GHz PC .  Is it 
conceivable that I'm overloading it?

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-27 22:05 [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames Timur Tabi
  2016-10-27 22:12 ` Florian Fainelli
@ 2016-10-31 17:48 ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2016-10-31 17:48 UTC (permalink / raw)
  To: timur
  Cc: netdev, zefir.kurtisi, scampbel, alokc, shankerd, andrew, f.fainelli

From: Timur Tabi <timur@codeaurora.org>
Date: Thu, 27 Oct 2016 17:05:01 -0500

> The Atheros 8031 PHY supports the 802.3 extension for symmetric and
> asymmetric pause frames, so set that to the list of features supported
> by the phy.
> 
> Signed-off-by: Timur Tabi <timur@codeaurora.org>

It looks like Florian and you need to discuss this a little further
but here are some comments on my part.

First of all the PHY state for pause is merely a control for what gets
advertised in negotiation and the result after negotiation completes,
and that's about it.  Maybe is has an influence upon whether PAUSE
frames are passed to/from the MAC, but that would be the largest
extent of it even if so.

The MAC does all of the actual PAUSE processing.  When the MAC sees a
PAUSE frame is backs off it's transmitter.  When the amount of unused
RX buffers in it's ring gets very low, the MAC emits a PAUSE frame.

You also mentioned that you were surprised that getting 900MBit on a
multi-core 2GHZ ARM without drops isn't happening.  Well, this is a
very complex issue to analyze.  I can only give a few pointers after
taking a quick look at this out-of-tree driver.

Are you testing single-flow performance?  If so, even though this is a
multi-queue NIC the traffic will be going over only one of the queues
and thus all of those other cores are basically wasted, because only
one core will be processing this flow's packets.

Next, there are probably a lot of batching optimizations missing from
the driver.  For example, unconditionally always posting replenished
RX buffers ever time you process the RX ring is expensive.  Especially
expensive is the MMIO write to post the new RX buffers.  You should
batch them and only perform the MMIO write when say 8 or more new
RX buffers have been posted.

This is a pretty common optimization if you look at other drivers.

Next, the DMA map/unmap operations could be (relatively) expensive
on this platform and contribute to what packet rates are possible
without drops.

But all of this is speculation, you really need to look at "perf"
output to see if the kernel is spending an excessive amount of time in
one place or another during your tests.  At least this way you'll have
some hard data to work with and have some kind of idea what might be
the reason.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-31 17:23           ` Timur Tabi
@ 2016-10-31 17:55             ` Florian Fainelli
  2016-10-31 21:14               ` Timur Tabi
  0 siblings, 1 reply; 10+ messages in thread
From: Florian Fainelli @ 2016-10-31 17:55 UTC (permalink / raw)
  To: Timur Tabi, netdev, zefir.kurtisi, scampbel, alokc, shankerd, andrew

On 10/31/2016 10:23 AM, Timur Tabi wrote:
> Florian Fainelli wrote:
> 
>> May I suggest reading about standards a bit more, or just looking at
>> other drivers, like tg3.c.
> 
> I have been doing that for over six months now.  There's only so much I
> can glean from reading source code and standards documents.  The inner
> workings of our NIC are a mystery lost to time.
> 
>>> Ok, to me that means that the PHY driver must tell phylib whether or not
>>> it supports pause frames.  The question becomes:
>>>
>>> 1) Is my patch correct?
>>> 2) Is my patch necessary?
>>> 3) Is my patch sufficient?
>>
>> Your patch is correct but also insufficient, although, as indicated
>> before, there is an misconception with PHY drivers that they should be
>> setting SUPPORTED_*Pause* bits,
> 
> If that's a misconception, then why is my patch correct?  It modifies
> the PHY driver to set those bits.  These drivers do the same thing:

Your patch is not correct nor incorrect, but results in PHYLIB to
advertise pause frames to your link partner, and since you also have
flow control enabled, this produces the desired effect of getting what
you need, and the state is consistend. But as indicated before, this is
not how you are supposed to fix that.

> 
> bcm63xx.c
> bcm7xxx.c
> bcm-cygnus.c
> broadcom.c
> icplus.c
> micrel.c
> microchip.c
> national.c
> smsc.c
> ste10Xp.c

Ever heard of cargo cult programming?

> 
>>  the MAC should do that, based on what it
>> does, anyway, but more importantly you need to have your Ethernet MAC
>> react properly upon what the auto-negotiation and locally configured
>> pause/flow control settings are, and configure the Ethernet MAC
>> accordingly.
> 
> Ok, so I should enable support for pause frames in my MAC if
> phydev->pause and/or phydev->asym_pause is set, and disable it otherwise?

You don't read my emails do you? By setting phydev->supported and
phydev->features to include SUPPORTED_Pause and SUPPORTED_AsymPause when
you have *connected* to the PHY.

> 
> If I read the spec correct (tx=MAC sends pause frames, rx=MAC
> understands received pause frames),
> 
> pause    asym_pause    enable tx?    enable rx?
> -----    ----------    ----------    ----------
>   0           0           No             No
>   0           1           Yes            No
>   1           0           Yes            Yes
>   1           1           No             Yes
> 
> The last two seem backwards.  The internal driver enables RX and TX if
> pause and asym_pause is enabled.
> 
>> When you want pause frames to be enabled, you need to advertise them,
>> and in order to do so, you need to set phydev->supported to have
>> SUPPORTED_Pause and SUPPORTED_AsymPause, and call genphy_restart_aneg()
>> to transfer that to a write to the MII_ADVERTISE register, resulting in
>> your link partner to see that you now support Pause frames. Since pause
>> frames are now in use, you want your Ethernet MAC, not to ignore pause
>> frames (have flow control enabled).
> 
> Is there a way to enable pause frames without ethtool?  I intend to add
> ethtool support to my driver, but I want to fix this last bug first.

Auto-negotiation, with a forced policy to have flow control enabled is
certainly one of these cases.

> 
>>> 1) I believe my patch is correct, because many other PHY drivers do the
>>> same thing for the same reason.  If the PHY supports register 4 bits 10
>>> and 11, then those SUPPORTED_xxx bits should be set.
>>>
>>> 2) I don't know why my patch appears to be necessary, because I don't
>>> understand why a 1Gb NIC on a 2Ghz processor drops frames.  I suspect
>>> the program is not performance but rather a mis-programming.
>>>
>>> 4) Is my patch sufficient?  The internal driver always enables pause
>>> frame support in the PHY if the PHY says that pause frames are enabled.
>>>   In fact, I can't even turn off flow control in the internal driver:
>>
>> Please stop abusing this "PHY says", the PHY advertises what it is being
>> told to advertise, by the MAC...
> 
> Then I need help understanding why there are 10 other PHY drivers that
> set the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits in their .features
> flags.

Cargo cult programming, replicating something you thought was right, or
did the job at the time, did not look back on it, and nobody cared
fixing it.

> 
>>> $ sudo ethtool -A eth1 tx off rx off
>>> $ sudo ethtool -a eth1
>>> Pause parameters for eth1:
>>> Autonegotiate:  on
>>> RX:             on
>>> TX:             on
>>>
>>> The driver insists on leaving flow control enabled if autonegotiation is
>>> enabled.
>>
>> Actually, the driver forces the enabling of flow control in the MAC,
>> period, but does not do anything with the auto-negotiation results, or I
>> could not spot it.
> 
> Sorry, the internal driver is very different that the external one.  It
> lacks phylib support.  You can see the internal driver here:
> 
> https://source.codeaurora.org/external/thundersoft/kernel/msm-3.10/tree/drivers/net/ethernet/msm/emac?h=caf/LA.BF.2.1.2_rb1.2

I have no interest in looking into something that is not upstream.

> 
> 
> Flow control is enabled or disabled via the TXFC and RXFC bits in
> function emac_hw_config_fc():
> 
> https://source.codeaurora.org/external/thundersoft/kernel/msm-3.10/tree/drivers/net/ethernet/msm/emac/emac_hw.c?h=caf/LA.BF.2.1.2_rb1.2#n1306
> 
> 
>> This was spotted in the review, but not addressed, but your adjust_link
>> callback seems completely bogus, it does a full MAC stop,
>> reconfiguration and restart,
> 
> I've been struggling to figure out what exactly the driver should do
> during adjust_link, although I don't see where it's doing a full
> restart.  If the link is up, it just calls emac_mac_start(), which just
> starts the MAC.

"just starts" the MAC is not quite what emac_mac_start() does, it does a
lot more than just programming the duplex and flow control settings, it
also deals with receiver/transmitter enable, touching some queue
configuration, interrupt etc., programs a handful of MAC bits relating
to packet processing etc.. all of this should be unnecessary at the time
you react to a link change, and presumably also needs to be done in a
controlled manner, as the hardware may be acting on a packet boundary
(you may have to wait for a full packet to drain before doing this or that).

All of this is much more than what is expected for a link up/down event.

> 
>> that is at best unnecessary and costly, but
>> it also misses a few things and does not act upon reading phydev->pause
>> and phydev->asym_pause to set or clear TXFC and RXFC, that really seems
>> to be your problem here.
> 
> Now that I understand (barely) what phydev->pause and phydev->asym_pause
> do, I can modify emac_mac_start() to use those values to set TXFC and RXFC.
> 
> Unfortunately, that won't explain why my driver needs the PHY to pass
> those frames to avoid 10% packet loss.  You would think a 1Gb NIC on a
> 24-core 2Ghz SOC could handle 900 Mbps.
> 
> Note that if I throttle my bandwidth to 90 Mbps, I still get packet
> loss.  However, if I switch the link from gigabit to 100Mbps, then I
> don't get packet loss.  So it's not the actual throughput that's the
> problem.  I get CRC errors in gigabit mode no matter what, if the pause
> frames are blocked by the PHY.

Then really this needs more investigation from your side instead of just
enabling pause frames and making the problem go away.

> 
>> Here is what could possibly go wrong right now:
>>
>> - your Ethernet MAC unconditionally enables flow control at the MAC
>> level, but does not advertise support for that in MII_ADVERTISE until
>> you set SUPPORTED_Pause and SUPPORTED_Asym_Pause in the PHY driver, fair
>> enough
>>
>> - the link partner you are testing against does not keep up well with
>> your transmit rates, but supports flow control, so Pause frames that it
>> sends back at you to tell you to stop transmitting so quickly just get
>> ignored, because not being advertised, you experience packet loss
> 
> The link partner is an e1000 NIC on an 8-core 3.6GHz PC .  Is it
> conceivable that I'm overloading it?

Yes it is, and if you are not directly connected to it, maybe it is the
switch in between, depending on how it deals with flow control you could
run into a problem with the switch. Not all switches support end-to-end
pause frames, and will terminate it locally, and then try to also apply
flow control with the otehr link partner: your e1000 NIC on the other PC.

Flow control issues are kind of difficult to resolve for sure, but
David's reply from a few minutes ago is also how I would approach the
problem.
-- 
Florian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames
  2016-10-31 17:55             ` Florian Fainelli
@ 2016-10-31 21:14               ` Timur Tabi
  0 siblings, 0 replies; 10+ messages in thread
From: Timur Tabi @ 2016-10-31 21:14 UTC (permalink / raw)
  To: Florian Fainelli, netdev, zefir.kurtisi, scampbel, alokc,
	shankerd, andrew

Florian Fainelli wrote:
>>pause    asym_pause    enable tx?    enable rx?
>>-----    ----------    ----------    ----------
>>   0           0           No             No
>>   0           1           Yes            No
>>   1           0           Yes            Yes
>>   1           1           No             Yes
>>
>>The last two seem backwards.  The internal driver enables RX and TX if
>>pause and asym_pause is enabled.

Is this chart correct?  I see drivers handle this differently.

nb8800_pause_config() has this:

	priv->pause_rx = phydev->pause;
	priv->pause_tx = phydev->pause ^ phydev->asym_pause;

But bcm_sf2_sw_adjust_link() does this:

	if (phydev->pause) {
		if (phydev->asym_pause)
			reg |= TX_PAUSE_EN;
		reg |= RX_PAUSE_EN;
	}

Which one is correct?

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-10-31 21:14 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-27 22:05 [PATCH] net: phy: at803x: the Atheros 8031 supports pause frames Timur Tabi
2016-10-27 22:12 ` Florian Fainelli
2016-10-27 22:24   ` Timur Tabi
2016-10-27 22:39     ` Florian Fainelli
2016-10-28 20:06       ` Timur Tabi
2016-10-28 21:03         ` Florian Fainelli
2016-10-31 17:23           ` Timur Tabi
2016-10-31 17:55             ` Florian Fainelli
2016-10-31 21:14               ` Timur Tabi
2016-10-31 17:48 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.