linux-kernel.vger.kernel.org archive mirror
From: Bjorn Helgaas <helgaas@kernel.org>
To: Marc MERLIN <marc_nouveau@merlins.org>
Cc: nouveau@lists.freedesktop.org,
	Mika Westerberg <mika.westerberg@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux PCI <linux-pci@vger.kernel.org>
Subject: Re: 5.9.11 still hanging 2mn at each boot and looping on nvidia-gpu 0000:01:00.3: PME# enabled (Quadro RTX 4000 Mobile)
Date: Fri, 29 Jan 2021 15:20:32 -0600
Message-ID: <20210129212032.GA99457@bjorn-Precision-5520>
In-Reply-To: <20210129005626.GP29348@merlins.org>

On Thu, Jan 28, 2021 at 04:56:26PM -0800, Marc MERLIN wrote:
> On Wed, Jan 27, 2021 at 03:33:00PM -0600, Bjorn Helgaas wrote:
> > Hi Marc, I appreciate your persistence on this.  I am frankly
> > surprised that you've put up with this so long.
>  
> Well, been using linux for 27 years, but also it's not like I have much
> of a choice outside of switching to windows, as tempting as it's getting
> sometimes ;)
> 
> > > after boot, when it gets the right trigger (not sure which ones), it
> > > loops on this every 2 seconds, mostly forever.
> > > 
> > > I'm not sure if it's nouveau's fault or the kernel's PCI PME's fault, or something else.
> > 
> > IIUC there are basically two problems:
> > 
> >   1) A 2 minute delay during boot
> > Another random thought: is there any chance the boot delay could be
> > related to crypto waiting for entropy?
> 
> So, the 2mn hang went away after I added the nouveau firmware in initrd.
> The only problem is that the nouveau driver does not give a very good
> clue as to what's going on and what to do.
>
> For comparison the intel iwlwifi driver is very clear about firmware
> it's trying to load, if it can't and what exact firmware you need to
> find on the internet (filename)

I guess you're referring to this in iwl_request_firmware()?

  IWL_ERR(drv, "check git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git\n"); 

How can we fix this in nouveau so we don't have to debug this again?
I don't really know how firmware loading works, but "git grep -A5
request_firmware drivers/gpu/drm/nouveau/" shows that we generally
print something when request_firmware() fails.

But I didn't notice those messages in your logs, so I'm probably
barking up the wrong tree.
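
As a rough aid for checking logs for this kind of failure, here is a
minimal sketch that scans kernel log lines for the firmware loader's
"Direct firmware load for ... failed with error ..." warning and lists
the file names involved.  The helper name missing_firmware(), the
"exampledrv" driver prefix, and the example/fw.bin path are hypothetical
and only illustrate the idea; nouveau may report failures differently or
not at all, as noted above.

```python
import re

# Warning text printed by the kernel firmware loader when a direct load
# fails; the "<driver> <device>:" prefix in front of it varies.
FW_FAIL = re.compile(
    r"Direct firmware load for (?P<name>\S+) failed with error (?P<err>-?\d+)"
)


def missing_firmware(log_lines):
    """Return unique (firmware name, error code) pairs, in first-seen order."""
    seen = []
    for line in log_lines:
        match = FW_FAIL.search(line)
        if match:
            entry = (match.group("name"), int(match.group("err")))
            if entry not in seen:
                seen.append(entry)
    return seen


# Hypothetical log excerpt, not taken from the logs in this thread.
sample = [
    "[    2.100000] exampledrv 0000:01:00.0: Direct firmware load for example/fw.bin failed with error -2",
    "[    2.200000] exampledrv 0000:01:00.0: unrelated message",
    "[    4.300000] exampledrv 0000:01:00.0: Direct firmware load for example/fw.bin failed with error -2",
]

for name, err in missing_firmware(sample):
    print(f"missing firmware: {name} (error {err})")
```

The same function could be fed the lines of saved dmesg output.  An
empty result only means the firmware loader did not log a failure; it
does not prove that all required firmware was present.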

> >   2) Some sort of event every 2 seconds that kills your battery life
> > Your machine doesn't sound unusual, and I haven't seen a flood of
> > similar reports, so maybe there's something unusual about your config.
> > But I really don't have any guesses for either one.
> 
> Honestly, there are not too many thinkpad P73 running linux out there. I
> wouldn't be surprised if it's only a handful or two.
> 
> > It sounds like v5.5 worked fine and you first noticed the slow boot
> > problem in v5.8.  We *could* try to bisect it, but I know that's a lot
> > of work on your part.
> 
> I've done that in the past, to be honest now that it works after I added
> the firmware that nouveau started needing, and didn't need before, the
> hang at boot is gone for sure.
> The PCI PM wakeup issues on batteries happen sometimes still, but they
> are much more rare now.

So maybe the wakeups are related to having vs not having the nouveau
firmware?  I'm still curious about that, and it smells like a bug to
me, but probably something to do with nouveau where I have no hope of
debugging it.

> > Grasping for any ideas for the boot delay; could you boot with
> > "initcall_debug" and collect your "lsmod" output?  I notice async_tx
> > in some of your logs, but I have no idea what it is.  It's from
> > crypto, so possibly somewhat unusual?
> 
> Is this still needed? I think if nouveau does a better job of helping
> the user correct the issue if firmware is missing (I think intel even
> gives a URL in printk), that would probably be what's needed for the
> most part.

Nope, don't bother with this, thanks.

Bjorn
