From: Linus Torvalds <torvalds@linux-foundation.org>
To: Julia Lawall <julia.lawall@inria.fr>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	nicolas.palix@univ-grenoble-alpes.fr,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: problem booting 5.10
Date: Tue, 8 Dec 2020 10:31:49 -0800
Message-ID: <CAHk-=wi=R7uAoaVK9ewDPdCYDn1i3i19uoOzXEW5Nn8UV-1_AA@mail.gmail.com>
In-Reply-To: <alpine.DEB.2.22.394.2012081813310.2680@hadrien>

On Tue, Dec 8, 2020 at 9:37 AM Julia Lawall <julia.lawall@inria.fr> wrote:
>
> We have not succeeded to boot 5.10 on our Intel(R) Xeon(R) CPU E7-8870 v4 @
> 2.10GHz server.  Previous versions (eg 4.19 - 5.9) boot fine.  We have
> tried various rcs.

So the problem started with rc1?

Could you try bisecting - even partially? If you do only six
bisections, the number of suspect commits drops from 15k to about 230
- which likely pinpoints the suspect area.
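
As a rough check of that arithmetic, the sketch below halves a 15k-commit
range six times. It assumes an idealized linear history and worst-case
halving at every step; real git bisect on merge-heavy history will not split
exactly in half. The function names are illustrative only and are not part
of git.

```python
import math


def remaining_after(total_commits: int, steps: int) -> int:
    """Worst-case number of suspect commits left after `steps` bisections.

    Assumption: each step discards half of the range and the larger half
    is the one that remains (linear history, no skipped commits).
    """
    remaining = total_commits
    for _ in range(steps):
        remaining = (remaining + 1) // 2  # keep the larger half
    return remaining


def steps_to_isolate(total_commits: int) -> int:
    """Bisections needed to narrow an idealized range to a single commit."""
    return math.ceil(math.log2(total_commits))


if __name__ == "__main__":
    print(remaining_after(15_000, 6))  # 235, i.e. "about 230"
    print(steps_to_isolate(15_000))    # 14
```

So six steps already cut the range by a factor of 64, and a full bisect of
this idealized range would take about 14 steps.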

That said, your traces certainly make me go "Hmm. Something broke in
SCSI device scanning", with the primary one being the
wait_for_completion() one - the rest of the stuck processes seem to be
stuck in async_synchronize_cookie_domain() and are presumably waiting
for this kthread that is waiting for the scan to finish.
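
The shape of that dependency chain can be modelled in user space. The
sketch below is only an analogy in Python, not kernel code: scan_done stands
in for the completion the scan never signals, worker_done for whatever the
other tasks are synchronizing on, and every name in it is hypothetical. The
timeouts exist only so the example terminates; the kernel's
wait_for_completion() has no timeout, which is why the real tasks stay in
state D.

```python
import threading

scan_done = threading.Event()    # never set: models the scan that does not finish
worker_done = threading.Event()  # set only once the worker's wait returns


def scan_worker(timeout: float) -> None:
    # Analogue of the kworker blocked in wait_for_completion().
    if scan_done.wait(timeout):
        worker_done.set()


def dependent(timeout: float, results: list, idx: int) -> None:
    # Analogue of a task waiting for the worker to finish.
    results[idx] = worker_done.wait(timeout)


def run(n_dependents: int = 3, timeout: float = 0.2) -> list:
    """Return, per dependent, whether its wait completed before the timeout."""
    results = [None] * n_dependents
    threads = [threading.Thread(target=scan_worker, args=(timeout,))]
    threads += [
        threading.Thread(target=dependent, args=(timeout, results, i))
        for i in range(n_dependents)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run())  # [False, False, False]: nobody makes progress
```

One unsignalled completion is enough to stall every waiter behind it, which
is why the first blocked task in the trace is the interesting one.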

So I'm adding SCSI people to the cc, just in case they go "Hmm..".

Martin & co - in the next email Julia also quotes

> [   51.355655][    T7] scsi 0:0:14:0: Direct-Access     ATA      ST2000LM015-2E81 SDM1 PQ: 0 ANSI: 6
> Gave up waiting for root file system device.  Common problems:[..]

which seems to be more of the same pattern with the SCSI scanning failure.

Of course, it could be some non-scsi patch that causes this, but.. A
bisect would hopefully clarify.

Leaving the (simplified) backtrace quoted below.

                   Linus

> The backtrace for rc7 is shown below.
>
> [  253.207171][  T979] INFO: task kworker/u321:2:1278 blocked for more than 120 seconds.
> [  253.224089][  T979]       Tainted: G            E     5.10.0-rc7 #3
> [  253.239209][  T979] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  253.256990][  T979] task:kworker/u321:2  state:D stack:    0 pid: 1278 ppid:     2 flags:0x00004000
> [  253.275552][  T979] Workqueue: events_unbound async_run_entry_fn
> [  253.290687][  T979] Call Trace:
> [  253.302491][  T979]  __schedule+0x31e/0x890
> [  253.315353][  T979]  schedule+0x3c/0xa0
> [  253.327688][  T979]  schedule_timeout+0x274/0x310
> [  253.379283][  T979]  wait_for_completion+0x8a/0xf0
> [  253.392327][  T979]  scsi_complete_async_scans+0x107/0x170
> [  253.406115][  T979]  __scsi_add_device+0xf7/0x130
> [  253.418974][  T979]  ata_scsi_scan_host+0x98/0x1c0
> [  253.431948][  T979]  async_run_entry_fn+0x39/0x160
> [  253.444853][  T979]  process_one_work+0x24c/0x490
