* PROBLEM: sparc64 random crashes starting w/ Linux 6.1 (regression)
@ 2023-01-29  2:17 Nick Bowler
  2023-01-29 22:14 ` Peter Xu
  2023-01-30  9:37 ` Linux kernel regression tracking (#adding)
  0 siblings, 2 replies; 9+ messages in thread
From: Nick Bowler @ 2023-01-29  2:17 UTC (permalink / raw)
  To: linux-kernel, sparclinux, regressions; +Cc: Peter Xu


Starting with Linux 6.1.y, my sparc64 (Sun Ultra 60) system is very
unstable, with userspace processes randomly crashing with all kinds of
different weird errors.  The same problem occurs on 6.2-rc5.  Linux
6.0.y is OK.

Usually, it manifests with ssh connections just suddenly dropping out
like this:

  malloc(): unaligned tcache chunk detected
  Connection to alectrona closed.

but other kinds of failures (random segfaults, bus errors, etc.) are
seen too.

I have never seen the kernel itself oops or anything like that, and
there are no abnormal kernel log messages of any kind, except for the
normal ones that get printed when processes segfault, like this one:

  [  563.085851] zsh[2073]: segfault at 10 ip 00000000f7a7c09c (rpc 00000000f7a7c0a0) sp 00000000ff8f5e08 error 1 in[f7960000+1b2000]
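
For anyone comparing several of these log lines, a minimal Python sketch
of pulling the fields apart may help.  The regular expression is only
modelled on the single line quoted above, not on the kernel source; it
assumes the numeric fields are hexadecimal, and the names SEGFAULT_RE and
parse_segfault are made up for this illustration.  The object name
between "in" and "[" is absent in the quoted line, so it is treated as
optional.

```python
import re

# Modelled on the one log line quoted in this report (an assumption, not
# taken from the kernel's format string). The mapped object's name may be
# missing, as it is in the quoted line, so that group can match nothing.
SEGFAULT_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+"
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+)\s+"
    r"ip (?P<ip>[0-9a-f]+)\s+\(rpc (?P<rpc>[0-9a-f]+)\)\s+"
    r"sp (?P<sp>[0-9a-f]+)\s+error (?P<error>[0-9a-f]+)"
    r"(?:\s+in\s*(?P<object>[^\[\s]*)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

HEX_FIELDS = ("addr", "ip", "rpc", "sp", "error", "base", "size")


def parse_segfault(line):
    """Return a dict of fields from a segfault log line, or None."""
    # Collapse whitespace so a line wrapped by a mail client still matches.
    match = SEGFAULT_RE.search(" ".join(line.split()))
    if match is None:
        return None
    info = match.groupdict()
    for key in HEX_FIELDS:
        if info[key] is not None:
            info[key] = int(info[key], 16)  # assumed to be hexadecimal
    info["pid"] = int(info["pid"])
    info["object"] = info["object"] or None
    # Offset of the faulting instruction inside the mapping, if it lies there.
    info["ip_offset"] = None
    if info["base"] is not None:
        offset = info["ip"] - info["base"]
        if 0 <= offset < info["size"]:
            info["ip_offset"] = offset
    return info


line = ("[  563.085851] zsh[2073]: segfault at 10 ip 00000000f7a7c09c (rpc\n"
        "00000000f7a7c0a0) sp 00000000ff8f5e08 error 1 in[f7960000+1b2000]")
info = parse_segfault(line)
print(info["comm"], info["pid"], hex(info["addr"]), hex(info["ip_offset"]))
```

The ip_offset value is the faulting instruction's position relative to
the start of the mapping, which is the number to compare across crashes
when checking whether they hit the same code.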

I was able to reproduce this fairly reliably by using GNU ddrescue to
dump a disk from the DVD drive -- things usually go awry after a minute
or two.  That let me bisect to this commit:

  2e3468778dbe3ec389a10c21a703bb8e5be5cfbc is the first bad commit
  commit 2e3468778dbe3ec389a10c21a703bb8e5be5cfbc
  Author: Peter Xu <>
  Date:   Thu Aug 11 12:13:29 2022 -0400

      mm: remember young/dirty bit for page migrations

This does not revert cleanly on master, but I ran my test extra times
on the immediately preceding commit (0ccf7f168e17: "mm/thp: carry over
dirty bit when thp splits on pmd") and was unable to get that one to
crash, so I am reasonably confident in this bisection result.
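
As an illustration of the reasoning behind that confidence, here is a
minimal Python sketch of bisecting with an intermittent failure.  It is
not how git bisect is implemented (git walks a commit graph, while this
assumes a simple linear history), and the names first_bad,
fails_in_any_run and the commit labels are hypothetical.  The point is
that a flaky test can wrongly mark a bad commit as good, so each "good"
verdict is worth repeating.

```python
def fails_in_any_run(run_once, commit, runs=3):
    """Treat a commit as bad if any of several test runs fails.

    run_once(commit) returns True when the test crashed on that commit.
    Repeating guards against an intermittent bug passing by luck.
    """
    return any(run_once(commit) for _ in range(runs))


def first_bad(commits, is_bad):
    """Binary search for the first bad commit in a linear history.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history: c0..c9, where the bug arrives in c6.
history = [f"c{i}" for i in range(10)]
tested = []


def crashes(commit):
    tested.append(commit)
    return int(commit[1:]) >= 6


culprit = first_bad(history, lambda c: fails_in_any_run(crashes, c))
print(culprit)
```

With a test that fails only some of the time, the runs parameter is the
knob that matters: too few runs on the commit just before the suspect
one and the bisection can converge on the wrong commit.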

Let me know if you need any more info!

