From: Diogo Ivo <diogo.ivo@siemens.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: vkoul@kernel.org, kishon@kernel.org,
	linux-phy@lists.infradead.org, tjoseph@cadence.com,
	linux-pci@vger.kernel.org, ylal@codeaurora.org,
	regressions@lists.linux.dev, jan.kiszka@siemens.com,
	diogo.ivo@siemens.com
Subject: Re: [REGRESSION] Keystone PCI driver probing and SerDes PLL timeout
Date: Fri, 12 Jan 2024 11:46:46 +0000
Message-ID: <1cce86b1-b309-40e0-afff-45baeb013e1f@siemens.com>
In-Reply-To: <2024011246-corned-disregard-7123@gregkh>


On 1/12/24 07:57, Greg KH wrote:
> On Thu, Jan 11, 2024 at 02:13:30PM +0000, Diogo Ivo wrote:
>> Hello,
>>
>> When testing the IOT2050 Advanced M.2 platform with Linux CIP 6.1
>> we came across a breakage in the probing of the Keystone PCI driver
>> (drivers/pci/controller/dwc/pci-keystone.c). This probing was working
>> correctly in the previous version we were using, v5.10.
>>
>> In order to debug this we changed over to mainline Linux, and bisecting
>> led us to commit e611f8cd8717 as the culprit. With it applied
>> we get the following messages:
>>
>> [   10.954597] phy-am654 910000.serdes: Failed to enable PLL
>> [   10.960153] phy phy-910000.serdes.3: phy poweron failed --> -110
>> [   10.967485] keystone-pcie 5500000.pcie: failed to enable phy
>> [   10.973560] keystone-pcie: probe of 5500000.pcie failed with error -110
>>
>> This timeout is occurring in serdes_am654_enable_pll(), called from the
>> phy_ops .power_on() hook.
>>
>> Due to the nature of the error messages and the contents of the commit we
>> believe that this is due to an unidentified race condition in the probing of
>> the Keystone PCI driver when enabling the PHY PLLs, since changes in the
>> workqueue the deferred probing runs on should not affect whether probing
>> works. To further support the existence of a race condition, commit
>> 86bfbb7ce4f6 (a scheduler commit) fixes probing, most likely unintentionally,
>> meaning that the problem may arise again in the future.
>>
>> One possible explanation is that there are pre-requisites for enabling the PLL
>> that are not being met when e611f8cd8717 is applied; to see if this is the case
>> help from people more familiar with the hardware details would be useful.
>>
>> As official support specifically for the IOT2050 Advanced M.2 platform was
>> introduced in Linux v6.3 (so in the middle of the commits mentioned above)
>> all of our testing was done with the latest mainline DeviceTree with [1]
>> applied on top.
>>
>> This is being reported as a regression even though technically things are
>> working with the current state of mainline since we believe the current fix
>> to be an unintended by-product of other work.
>>
>> #regzbot introduced: e611f8cd8717
> A "regression" for a commit that was in 5.13, i.e. almost 2 years ago,
> is a bit tough, and not something I would consider really a "regression"
> as it is core code that everyone runs.  Given you point at scheduler
> changes also fixing the issue, this seems like a hint as to what is
> wrong with your driver/platform, but is not the root cause of it and
> needs to be resolved.  Please look at fixing it in your drivers?  Are
> they all in Linus's tree?
>
> thanks,
>
> greg k-h

Hello,

I see the point that this code has been living in the kernel for a
long time now and that it becomes more difficult to justify it as
a regression; I reported it as such based on the supposition that
the current fix is not the proper one and that technically this
support was broken between the identified commits.

If this situation is incompatible with a regression report then it
can be dropped as one and we can keep it as a bug report for which
we are looking for input from the community.

I agree that this needs to be fixed in the driver since all other
drivers are working fine with e611f8cd8717, and yes, all of the
drivers in question are in mainline, where we performed the bisection.
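
For reference, the -110 in the log above corresponds to -ETIMEDOUT on most
Linux architectures, i.e. a bounded wait for the PLL gave up. The following
is only a minimal userspace sketch of that bounded-poll pattern, not the
actual code of serdes_am654_enable_pll(); struct fake_pll, fake_pll_locked()
and poll_pll_locked() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a PLL status register: it reports "locked"
 * only once it has been read ready_after times.
 */
struct fake_pll {
    unsigned int reads;
    unsigned int ready_after;
};

/* Read the fake status once and report whether the PLL is locked. */
static bool fake_pll_locked(struct fake_pll *pll)
{
    pll->reads++;
    return pll->reads >= pll->ready_after;
}

/*
 * Poll the status at most max_polls times.
 * Returns 0 once the PLL reports lock, -ETIMEDOUT if it never does.
 */
static int poll_pll_locked(struct fake_pll *pll, unsigned int max_polls)
{
    assert(pll);

    for (unsigned int i = 0; i < max_polls; i++) {
        if (fake_pll_locked(pll))
            return 0;
    }
    return -ETIMEDOUT;
}
```

With such a pattern, anything that delays the hardware becoming ready (or
makes the poll start earlier) can turn a success into a timeout without any
change to the polling code itself, which would be consistent with the
timing-dependent behaviour described above.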

Thank you,

Diogo Ivo


Thread overview: 6+ messages
2024-01-11 14:13 [REGRESSION] Keystone PCI driver probing and SerDes PLL timeout Diogo Ivo
2024-01-12  7:57 ` Greg KH
2024-01-12 11:46   ` Diogo Ivo [this message]
2024-01-12 12:51     ` Linux regression tracking (Thorsten Leemhuis)
2024-01-22  5:52     ` Kishon Vijay Abraham I
2024-01-22  6:13       ` Siddharth Vadapalli
