linux-kernel.vger.kernel.org archive mirror
From: Nick Bowler <nbowler@draconx.ca>
To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
	regressions@lists.linux.dev
Cc: Peter Xu <peterx@redhat.com>
Subject: PROBLEM: sparc64 random crashes starting w/ Linux 6.1 (regression)
Date: Sat, 28 Jan 2023 21:17:31 -0500
Message-ID: <CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u34BA@mail.gmail.com>

Hi,

Starting with Linux 6.1.y, my sparc64 (Sun Ultra 60) system is very
unstable, with userspace processes randomly crashing with all kinds of
different weird errors.  The same problem occurs on 6.2-rc5.  Linux
6.0.y is OK.

Usually, it manifests with ssh connections just suddenly dropping out
like this:

  malloc(): unaligned tcache chunk detected
  Connection to alectrona closed.

but other kinds of failures (random segfaults, bus errors, etc.) are
seen too.

I have never seen the kernel itself oops or anything like that, and
there are no abnormal kernel log messages of any kind, except for the
normal ones that get printed when processes segfault, like this one:

  [  563.085851] zsh[2073]: segfault at 10 ip 00000000f7a7c09c (rpc 00000000f7a7c0a0) sp 00000000ff8f5e08 error 1 in libc.so.6[f7960000+1b2000]
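
In case it helps anyone map that faulting instruction back into libc,
here is a minimal sketch that subtracts the mapping base from the ip
value in the message above.  The regular expression and the helper name
fault_offset are my own assumptions, written against this one sparc64
line only; the result is an offset into the reported mapping, which may
or may not equal the offset into the file on disk.

```python
import re

# Log line copied from the report above (joined onto one line).
LINE = ("[  563.085851] zsh[2073]: segfault at 10 ip 00000000f7a7c09c "
        "(rpc 00000000f7a7c0a0) sp 00000000ff8f5e08 error 1 in "
        "libc.so.6[f7960000+1b2000]")

# Pattern written for this one message; other architectures print
# different fields, so treat it as an assumption, not a specification.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) .* in "
    r"(?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, ip offset into the reported mapping)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Sanity check: the ip should fall inside [base, base + size).
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("obj"), offset


obj, offset = fault_offset(LINE)
print(f"{obj}+{offset:#x}")
```

The printed offset could then be looked up in a disassembly of the
matching libc build, assuming the mapping starts at the beginning of
the file.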

I was able to reproduce this fairly reliably by using GNU ddrescue to
dump a disk from the dvd drive -- things usually go awry after a minute
or two.  So I was able to bisect to this commit:

  2e3468778dbe3ec389a10c21a703bb8e5be5cfbc is the first bad commit
  commit 2e3468778dbe3ec389a10c21a703bb8e5be5cfbc
  Author: Peter Xu <peterx@redhat.com>
  Date:   Thu Aug 11 12:13:29 2022 -0400

      mm: remember young/dirty bit for page migrations

This does not revert cleanly on master, but I ran my test several extra
times on the immediately preceding commit (0ccf7f168e17: "mm/thp: carry
over dirty bit when thp splits on pmd") and could not get it to crash,
so I am reasonably confident in this bisection result.

Let me know if you need any more info!

Thanks,
  Nick
