linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Vinod Koul <vkoul@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linuxarm@huawei.com, mauro.chehab@huawei.com,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Kishon Vijay Abraham I <kishon@ti.com>,
	Manivannan Sadhasivam <mani@kernel.org>,
	Rob Herring <robh@kernel.org>,
	linux-kernel@vger.kernel.org, linux-phy@lists.infradead.org
Subject: Re: [PATCH v11 01/11] phy: HiSilicon: Add driver for Kirin 970 PCIe PHY
Date: Wed, 18 Aug 2021 12:08:02 +0200	[thread overview]
Message-ID: <20210818120802.3ba5fd59@coco.lan> (raw)
In-Reply-To: <20210818110123.33eba838@coco.lan>

Em Wed, 18 Aug 2021 11:01:23 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:

> Hi Vinod,
> 
> Em Tue, 17 Aug 2021 16:12:37 +0530
> Vinod Koul <vkoul@kernel.org> escreveu:
> 
> > > +	/* FIXME: calling it causes an Asynchronous SError interrupt */
> > > +//	kirin_pcie_clk_ctrl(phy, false);    
> > 
> > when will you fix the fixme and pls remove the deadcode  
> 
> Working with clocks on this SoC is very tricky: there are lots of clock
> lines (~70) that are critical for this device to work. Such lines are
> enabled via the Device's firmware, and are supposed to be always
> powered. Powering off such clock lines cause a SError.
> 
> Most clocks on this device are managed by the clk-hikey3670 driver.
> At the current state of clk-hi3670, the only way for HiKey 970
> to even boot is to add:
> 
> 	clk_ignore_unused=true
> 
> as a Kernel boot parameter. That is the solution given by the downstream
> official distributions for HiKey970 at 96boards.
> 
> The fix is to flag the critical clocks with CLK_IS_CRITICAL at the
> clk-hi3670 driver, but finding the right clock set has been a challenge.
> 
> I spent the last couple of weeks trying to identify the critical ones,
> as I'm aiming to be able to use a Kernel built with a default arm64
> one of my goals is to have this device working fine with a
> "make defconfig" Kernel.
> 
> So, I added this patch:
> 
> 	https://lore.kernel.org/lkml/2d2de5e902ced072bcfd5e5311d6b10326b9245b.1627041240.git.mchehab+huawei@kernel.org/
> 
> to my tree (which reduces the set of clocks using CLK_IGNORE_UNUSED
> from 308 to 163 clocks). Than I ran script that was dropping the
> flag one by one, boots the new Kernel and do a sanity check. When it 
> fails to boot, I manually dropped the patch, and re-run the script
> to test the remaining clocks. After a couple of weeks, I reached a patch
> with 78 clock lines that seemed critical, but the resulting patch was
> not stable, as, depending on the day I boot the Kernel with such patch,
> it crashes with SError in a couple of seconds after booting, or 
> cause the Ethernet firmware to not load.
> 
> I intend to keep trying to find the clock lines that can't be disabled,
> but this is very time consuming, as I couldn't find any documentation
> about that. So, it has to be done empirically.
> 
> -
> 
> In any case, fixing it doesn't sound a critical issue for the PHY
> driver. I mean, right now, this patchset allows removing and 
> re-inseting the PCIe driver, which is already an improvement over the
> original upstream driver, which was missing the power-off logic for
> Kirin 960.
> 
> With this patchset, both power-off/power-on logic for both HiKey960
> (where the PHY is inside the pcie-kirin driver) and for HiKey970,
> which uses this PHY driver. On both devices, I tested an endless loop 
> with rmmod/modprobe for the PCIe.
> 
> Besides that, in practice, removing PCIe in runtime is something that
> people usually don't do.
> 
> So, while it would be cool to balance the clock disable logic,
> I don't think this is a critical issue in this particular case.
> 
> Thanks,
> Mauro

Btw, this is one of such panic errors:

[    4.468948] hi3670_pcie_phy fc000000.pcie-phy: PIPE clk is not stable
[    4.522530] SError Interrupt on CPU4, code 0xbf000002 -- SError
[    4.522535] CPU: 4 PID: 223 Comm: systemd-udevd Not tainted 5.14.0-rc1+ #370
[    4.522537] Hardware name: HiKey970 (DT)
[    4.522539] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO BTYPE=--)
[    4.522540] pc : el1_interrupt+0x20/0x80
[    4.522542] lr : el1h_64_irq_handler+0x18/0x24
[    4.522543] sp : ffff800012903610
[    4.522543] x29: ffff800012903610 x28: ffff000108410e40 x27: ffff0001bf3e4100
[    4.522551] x26: 0000000000000000 x25: 0000000000000000 x24: ffff0001009dec10
[    4.522554] x23: 0000000040000005 x22: ffff800010ed1330 x21: ffff800012903790
[    4.522556] x20: ffff8000104e42e0 x19: ffff800012903640 x18: 0000000000000000
[    4.522559] x17: 0000000000000000 x16: 0000000000000000 x15: 0763072007450750
[    4.522563] x14: 074907500720073a x13: ffff0001b87e0000 x12: 000000000000053a
[    4.522565] x11: 00000000000001be x10: ffff0001bf2386c0 x9 : 00000000ffff0000
[    4.522568] x8 : ffff0001b87e0000 x7 : ffff0001bf2386c0 x6 : 0000000000000000
[    4.522571] x5 : ffff00010370aac0 x4 : ffff000108410e40 x3 : ffff800011f20cd8
[    4.522573] x2 : ffff000108410e40 x1 : 00000000000000c0 x0 : ffff800012903640
[    4.522577] Kernel panic - not syncing: Asynchronous SError Interrupt
[    4.522578] CPU: 4 PID: 223 Comm: systemd-udevd Not tainted 5.14.0-rc1+ #370
[    4.522579] Hardware name: HiKey970 (DT)
[    4.522579] Call trace:
[    4.522580]  dump_backtrace+0x0/0x1e0
[    4.522581]  show_stack+0x18/0x24
[    4.522581]  dump_stack_lvl+0x68/0x84
[    4.522582]  dump_stack+0x18/0x34
[    4.522583]  panic+0x16c/0x334
[    4.522583]  nmi_panic+0x8c/0x90
[    4.522584]  arm64_serror_panic+0x78/0x84
[    4.522585]  do_serror+0x58/0x5c
[    4.522586]  el1h_64_error_handler+0x30/0x50
[    4.522586]  el1h_64_error+0x78/0x7c
[    4.522588]  el1_interrupt+0x20/0x80
[    4.522588]  el1h_64_irq_handler+0x18/0x24
[    4.522589]  el1h_64_irq+0x78/0x7c
[    4.522590]  mutex_lock_io+0xf0/0x370
[    4.522591]  clk_unprepare+0x28/0x50
[    4.522591]  kirin_pcie_clk_ctrl+0x164/0x1a0 [phy_hi3670_pcie]
[    4.522592]  hi3670_pcie_phy_power_on+0x720/0xb00 [phy_hi3670_pcie]
[    4.522593]  phy_power_on+0x78/0x130
[    4.522594]  kirin_pcie_probe+0x6a8/0x88c [pcie_kirin]
[    4.522595]  platform_probe+0x68/0xe0
[    4.522596]  really_probe+0x1b0/0x42c
[    4.522596]  __driver_probe_device+0x114/0x190
[    4.522597]  driver_probe_device+0x40/0x100
[    4.522598]  __driver_attach+0xcc/0x1e0
[    4.522599]  bus_for_each_dev+0x70/0xd0
[    4.522600]  driver_attach+0x24/0x30
[    4.522601]  bus_add_driver+0x140/0x234
[    4.522601]  driver_register+0x78/0x130
[    4.522602]  __platform_driver_register+0x28/0x34
[    4.522603]  kirin_pcie_driver_init+0x24/0x1000 [pcie_kirin]
[    4.522604]  do_one_initcall+0x50/0x1b0
[    4.522605]  do_init_module+0x5c/0x254
[    4.522605]  load_module+0x21cc/0x2820
[    4.522606]  __do_sys_finit_module+0xbc/0x130
[    4.522607]  __arm64_sys_finit_module+0x24/0x30
[    4.522608]  invoke_syscall+0x48/0x114
[    4.522608]  el0_svc_common+0xc4/0xdc
[    4.522609]  do_el0_svc+0x28/0x90
[    4.522610]  el0_svc+0x2c/0x54
[    4.522610]  el0t_64_sync_handler+0x1a4/0x1b0
[    4.522611]  el0t_64_sync+0x198/0x19c
[    4.522633] SMP: stopping secondary CPUs
[    4.522634] Kernel Offset: disabled
[    4.522635] CPU features: 0x00003051,00000846
[    4.522636] Memory Limit: none

Thanks,
Mauro

  reply	other threads:[~2021-08-18 10:08 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-12  8:02 [PATCH v11 00/11] Add support for Hikey 970 PCIe Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 01/11] phy: HiSilicon: Add driver for Kirin 970 PCIe PHY Mauro Carvalho Chehab
2021-08-17 10:42   ` Vinod Koul
2021-08-18  9:01     ` Mauro Carvalho Chehab
2021-08-18 10:08       ` Mauro Carvalho Chehab [this message]
2021-08-18 10:10       ` Vinod Koul
2021-08-18 10:30         ` Mauro Carvalho Chehab
2021-08-18 10:37           ` Vinod Koul
2021-08-20 13:43             ` [PATCH v13] " Mauro Carvalho Chehab
2021-08-20 23:04               ` kernel test robot
2021-08-18 11:04         ` [PATCH v12] " Mauro Carvalho Chehab
2021-08-18 13:47           ` kernel test robot
2021-09-15 13:11           ` Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 02/11] PCI: kirin: Reorganize the PHY logic inside the driver Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 03/11] PCI: kirin: Add support for a PHY layer Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 04/11] PCI: kirin: Use regmap for APB registers Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 05/11] PCI: kirin: Add support for bridge slot DT schema Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 06/11] PCI: kirin: Add Kirin 970 compatible Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 07/11] PCI: kirin: Add MODULE_* macros Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 08/11] PCI: kirin: Allow building it as a module Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 09/11] PCI: kirin: Add power_off support for Kirin 960 PHY Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 10/11] PCI: kirin: fix poweroff sequence Mauro Carvalho Chehab
2021-08-12  8:02 ` [PATCH v11 11/11] PCI: kirin: Allow removing the driver Mauro Carvalho Chehab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210818120802.3ba5fd59@coco.lan \
    --to=mchehab+huawei@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=kishon@ti.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-phy@lists.infradead.org \
    --cc=linuxarm@huawei.com \
    --cc=mani@kernel.org \
    --cc=mauro.chehab@huawei.com \
    --cc=robh@kernel.org \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).