All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hans de Goede <hdegoede@redhat.com>
To: Petr Tesarik <ptesarik@suse.cz>, Heiner Kallweit <hkallweit1@gmail.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>,
	netdev@vger.kernel.org, linux-clk@vger.kernel.org
Subject: Re: RTL8402 stops working after hibernate/resume
Date: Tue, 29 Sep 2020 22:08:56 +0200	[thread overview]
Message-ID: <217ae37d-f2b0-1805-5696-11644b058819@redhat.com> (raw)
In-Reply-To: <20200929210737.7f4a6da7@ezekiel.suse.cz>

Hi,

On 9/29/20 9:07 PM, Petr Tesarik wrote:
> Hi Heiner (and now also Hans)!
> 
> @Hans: I'm adding you to this conversation, because you're the author
> of commit b1e3454d39f99, which seems to break the r8169 driver on a
> laptop of mine.

Erm, no, as you bi-sected yourself already commit 9f0b54cd167219
("r8169: move switching optional clock on/off to pll power functions")

Broke your laptop, commit b1e3454d39f99 ("clk: x86: add "ether_clk" alias
for Bay Trail / Cherry Trail") is about 18 months older.

> On Fri, 25 Sep 2020 16:47:54 +0200
> Heiner Kallweit <hkallweit1@gmail.com> wrote:
> 
>> On 25.09.2020 14:56, Petr Tesarik wrote:
>>> On Fri, 25 Sep 2020 11:52:41 +0200
>>> Petr Tesarik <ptesarik@suse.cz> wrote:
>>>    
>>>> On Fri, 25 Sep 2020 11:44:09 +0200
>>>> Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>   
>>>>> On 25.09.2020 10:54, Petr Tesarik wrote:
>>>> [...]
>>>>>> Does it make sense to bisect the change that broke the driver for me, or should I rather dispose of this waste^Wlaptop in an environmentally friendly manner? I mean, would you eventually accept a workaround for a few machines with a broken BIOS?
>>>>>>        
>>>>> If the workaround is small and there's little chance to break other stuff: then usually yes.
>>>>> If you can spend the effort to bisect the issue, this would be appreciated.
>>>>
>>>> OK, then I'm going to give it a try.
>>>
>>> Done. The system freezes when this commit is applied:
>>>
>>> commit 9f0b54cd167219266bd3864570ae8f4987b57520
>>> Author: Heiner Kallweit <hkallweit1@gmail.com>
>>> Date:   Wed Jun 17 22:55:40 2020 +0200
>>>
>>>      r8169: move switching optional clock on/off to pll power functions
>>>    
>> This sounds weird. On your system tp->clk should be NULL,

Heiner, why do you say that tp->clk should be NULL on Petr's
system? Because it is an x86 based system?

Some X86 SoCs, specifically, the more tablet oriented Bay and Cherry
Trail SoCs, which are much more ARM SoC like then other X86 SoCs do
also use the clock framework and the SoC has a number of external clk
pins which are typically used by audio codecs and by ethernet chips.

>> making
>> clk_prepare_enable() et al no-ops. Please check whether tp->clk
>> is NULL after the call to rtl_get_ether_clk().
> 
> This might be part of the issue. On my system tp->clk is definitely not
> NULL:
> 
> crash> *rtl8169_private.clk 0xffff9277aca58940
>    clk = 0xffff9277ac2c82a0
> 
> crash> *clk 0xffff9277ac2c82a0
> struct clk {
>    core = 0xffff9277aef65d00,
>    dev = 0xffff9277aed000b0,
>    dev_id = 0xffff9277aec60c00 "0000:03:00.2",
>    con_id = 0xffff9277ad04b080 "ether_clk",
>    min_rate = 0,
>    max_rate = 18446744073709551615,
>    exclusive_count = 0,
>    clks_node = {
>      next = 0xffff9277ad2428d8,
>      pprev = 0xffff9277aef65dc8
>    }
> }
> 
> The dev_id corresponds to the Ethernet controller:
> 
> 03:00.2 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL810xE PCI Express Fast Ethernet controller (rev 06)
> 
> Looking at clk_find(), it matches this entry in clocks:
> 
> struct clk_lookup {
>    node = {
>      next = 0xffffffffbc702f40,
>      prev = 0xffff9277bf7190c0
>    },
>    dev_id = 0x0,
>    con_id = 0xffff9277bf719524 "ether_clk",
>    clk = 0x0,
>    clk_hw = 0xffff9277ad2427f8
> }
> 
> That's because this kernel is built with CONFIG_PMC_ATOM=y, and looking
> at the platform initialization code, the "ether_clk" alias is created
> unconditionally. Hans already added.

Petr, unconditionally is not really correct here, just as claiming
above that my commit broke things was not really correct either.

I guess this is mostly semantics, but I don't appreciate
the accusatory tone here.

The code in question binds to a clk-pmc-atom platform_device which
gets instantiated by drivers/platform/x86/pmc_atom.c. Which in turn
binds to a PCI device which is only present on Bay Trail and Cherry
Trail SoCs.

IOW the commit operates as advertised in its Subject:
"clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail"

So with that all clarified lets try to see if we can figure out
*why* this is actually happening.

Petr, can you describe your hardware in some more detail,
in the bits quoted when you first Cc-ed me there is not that
much detail. What is the vendor and model of your laptop?

Looking closer at commit 9f0b54cd167219
("r8169: move switching optional clock on/off to pll power functions")
I notice that the functions which now enable/disable the clock:
rtl_pll_power_up() and rtl_pll_power_down()

Only run when the interface is up during suspend/resume.
Petr, I guess the laptop is not connected to ethernet when you
hibernate it?

That means that on resume the clock will not be re-enabled.

This is a subtle but important change and I believe that
this is what is breaking things. I guess that the PLL which
rtl_pll_power_up() / rtl_pll_power_down() controls is only
used for ethernet-timing.  But the external clock controlled
through pt->clk is a replacement for using an external
crystal with the r8169 chip. So with it disabled, the entire
chip does not have a clock and is essentially dead.
It can then e.g. not respond to any pci-e reads/writes done
to it.

So I believe that the proper fix for this is to revert
commit 9f0b54cd167219
("r8169: move switching optional clock on/off to pll power functions")

As that caused the whole chip's clock to be left off after
a suspend/resume while the interface is down.

Also some remarks about this while I'm being a bit grumpy about
all this anyways (sorry):

1. 9f0b54cd167219 ("r8169: move switching optional clock on/off
to pll power functions") commit's message does not seem to really
explain why this change was made...

2. If a git blame would have been done to find the commit adding
the clk support: commit c2f6f3ee7f22 ("r8169: Get and enable optional ether_clk clock")
then you could have known that the clk in question is an external
clock for the entire chip, the commit message pretty clearly states
this (although "the entire" part is implied only) :

"On some boards a platform clock is used as clock for the r8169 chip,
this commit adds support for getting and enabling this clock (assuming
it has an "ether_clk" alias set on it).

This is related to commit d31fd43c0f9a ("clk: x86: Do not gate clocks
enabled by the firmware") which is a previous attempt to fix this for some
x86 boards, but this causes all Cherry Trail SoC using boards to not reach
there lowest power states when suspending.

This commit (together with an atom-pmc-clk driver commit adding the alias)
fixes things properly by making the r8169 get the clock and enable it when
it needs it."

Regards,

Hans


  parent reply	other threads:[~2020-09-29 20:09 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-15  8:28 RTL8402 stops working after hibernate/resume Petr Tesarik
2020-07-15 15:22 ` Heiner Kallweit
2020-07-16  8:58   ` Petr Tesarik
2020-07-18 12:07     ` Heiner Kallweit
2020-09-03  8:41       ` Petr Tesarik
2020-09-17 14:10         ` Petr Tesarik
2020-09-23  9:57         ` Heiner Kallweit
2020-09-24 19:14           ` Petr Tesarik
2020-09-24 19:37             ` Heiner Kallweit
2020-09-24 20:12             ` Heiner Kallweit
2020-09-25  7:30               ` Petr Tesarik
2020-09-25  8:54                 ` Petr Tesarik
2020-09-25  9:44                   ` Heiner Kallweit
2020-09-25  9:52                     ` Petr Tesarik
2020-09-25 12:56                       ` Petr Tesarik
2020-09-25 14:47                         ` Heiner Kallweit
2020-09-29 19:07                           ` Petr Tesarik
2020-09-29 19:41                             ` Heiner Kallweit
2020-09-29 20:08                             ` Hans de Goede [this message]
2020-09-29 20:29                               ` Hans de Goede
2020-09-29 20:35                               ` Heiner Kallweit
2020-09-29 20:42                                 ` Heiner Kallweit
2020-09-30  9:04                                 ` Hans de Goede
2020-09-30 15:47                                   ` Heiner Kallweit
2020-09-30 16:41                                     ` Petr Tesarik
2020-09-30 17:13                                       ` Heiner Kallweit
2020-09-30 17:32                                       ` Petr Tesarik
2020-09-30 18:00                                         ` Petr Tesarik
2020-09-30 20:11                                           ` Heiner Kallweit
2020-09-30 21:44                                             ` Petr Tesarik
2020-10-01  6:14                                               ` Heiner Kallweit
2020-09-30 18:10                                     ` Petr Tesarik
2020-09-29 21:17                               ` Petr Tesarik
2020-07-15 15:25 ` Heiner Kallweit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=217ae37d-f2b0-1805-5696-11644b058819@redhat.com \
    --to=hdegoede@redhat.com \
    --cc=hkallweit1@gmail.com \
    --cc=linux-clk@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nic_swsd@realtek.com \
    --cc=ptesarik@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.